
Low accuracy of TF-Lite model for Mobilenet (Quantization aware training)

See original GitHub issue

Describe the bug
The accuracy of the TF-Lite model becomes extremely low after quantization-aware training of tf.keras.applications.mobilenet (v1/v2).

System information

TensorFlow installed from (source or binary): binary

TensorFlow version: tf-nightly-gpu (2.2.0.dev20200420)

TensorFlow Model Optimization version: 0.3.0

Python version: 3.6.9

Describe the expected behavior
The accuracy of the Keras model (with quantization-aware training) and the converted TF-Lite model should be almost the same.

Describe the current behavior

  • Train on the tf_flowers dataset.
  • Train a MobileNet V2 model without quantization-aware training.
  • After training, create a quantized model with the quantize_model API and fine-tune it with quantization-aware training.
  • Check the accuracy on the test set with the evaluate API:
    • Keras model without quantization-aware training: 0.99
    • Keras model with quantization-aware training: 0.97
  • Convert to a TF-Lite model and check the accuracy on the same test set. Accuracy is extremely low: 0.20.
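The comparison in the last two steps reduces to the same metric on both sides: top-1 accuracy over the test set, computed once by Keras' evaluate and once by looping over the TF-Lite interpreter's outputs. A minimal numpy-only helper for that comparison (top1_accuracy is a hypothetical name, not taken from the notebook):

```python
import numpy as np

def top1_accuracy(logits: np.ndarray, labels: np.ndarray) -> float:
    """Fraction of rows whose argmax matches the integer label."""
    preds = np.argmax(logits, axis=1)
    return float(np.mean(preds == labels))

# Example: three predictions, two of which match their labels.
logits = np.array([[0.1, 0.9],   # predicts class 1
                   [0.8, 0.2],   # predicts class 0
                   [0.3, 0.7]])  # predicts class 1
labels = np.array([1, 0, 0])
print(top1_accuracy(logits, labels))  # 2 of 3 correct
```

Feeding the same helper the stacked outputs of the Keras model and of the TF-Lite interpreter makes the 0.97 vs. 0.20 gap directly comparable.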

If the model is instead defined as follows, the accuracy of the Keras model and the TF-Lite model is almost the same:

  # extract image features with convolution and max-pooling layers
  inputs = tf.keras.Input(shape=(IMG_SIZE, IMG_SIZE, 3))
  x = tf.keras.layers.Conv2D(32, kernel_size=3, padding="same", activation="relu")(inputs)
  x = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))(x)
  x = tf.keras.layers.Conv2D(64, kernel_size=3, padding="same", activation="relu")(x)
  x = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))(x)
  # classify with fully-connected layers
  x = tf.keras.layers.Flatten()(x)
  x = tf.keras.layers.Dense(512, activation="relu")(x)
  x = tf.keras.layers.Dense(info.features['label'].num_classes)(x)
  x = tf.keras.layers.Activation("softmax")(x)
  model_functional = tf.keras.Model(inputs=inputs, outputs=x)

Code to reproduce the issue (Google Colab notebook) https://gist.github.com/NobuoTsukamoto/b42128104531a7612e5c85e246cb2dac


Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 19 (7 by maintainers)

Top GitHub Comments

4 reactions
nutsiepully commented, May 5, 2020

We’ve found the issue. One of the quantized kernels’ activation ranges had a problem, but it was being hidden once the range had converged.

We’ll have a fix out soon. tf-nightly should have it.
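The "activation range" in the fix above is the [min, max] interval each fake-quantization node learns during QAT: activations are clipped to it and rounded onto a fixed integer grid, so a wrong range silently distorts every value passing through. A minimal numpy illustration of affine fake-quantization (an illustrative sketch, not the tfmot implementation):

```python
import numpy as np

def fake_quantize(x, range_min, range_max, num_bits=8):
    """Clip x to [range_min, range_max], round onto a (2**num_bits - 1)-step
    grid, and map back to float -- the simulation QAT inserts after activations."""
    levels = 2 ** num_bits - 1
    scale = (range_max - range_min) / levels
    x = np.clip(x, range_min, range_max)
    q = np.round((x - range_min) / scale)
    return q * scale + range_min

acts = np.array([0.0, 1.5, 4.0, 6.0])
print(fake_quantize(acts, 0.0, 6.0))  # correct range: small rounding error only
print(fake_quantize(acts, 0.0, 1.0))  # wrong range: everything above 1.0 is crushed
```

With the correct range the round-trip error stays within one quantization step; with a wrong range the clipping destroys information, which is how a bad learned range can tank TF-Lite accuracy while the float Keras model still looks fine.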

2 reactions
nutsiepully commented, May 7, 2020

Thanks a lot @sayakpaul. Really appreciate the feedback and the effort.

Thanks @kmkolasinski and @NobuoTsukamoto for the detailed bug reports and feedback. I’m closing the bug. Please reopen if you face any further issues.

@sayakpaul, the report is awesome! Great work, this explains the value of the tooling really well.


