
Low accuracy of TF-Lite model for Mobilenet (Quantization aware training)

See original GitHub issue

Describe the bug
The accuracy of the TF-Lite model becomes extremely low after quantization-aware training of tf.keras.applications.mobilenet (v1/v2).

System information

TensorFlow installed from (source or binary): binary

TensorFlow version: tf-nightly-gpu (2.2.0.dev20200420)

TensorFlow Model Optimization version: 0.3.0

Python version: 3.6.9

Describe the expected behavior
The accuracy of the Keras model (with quantization-aware training) and the converted TF-Lite model should be almost the same.

Describe the current behavior

  • Train on the tf_flowers dataset.
  • Train a MobileNet V2 model without quantization-aware training.
  • After training, create a quantized model with the quantize_model API and fine-tune it with quantization-aware training.
  • Check the accuracy on the test set with the evaluate API:
    • Keras model without quantization-aware training: 0.99
    • Keras model with quantization-aware training: 0.97
  • Convert to a TF-Lite model and check the accuracy on the same test set. Accuracy is extremely low: 0.20.
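The comparison in the last two steps reduces to the same metric on both sides: top-1 accuracy over the test set, computed once by Keras' evaluate and once by looping over the TF-Lite interpreter's outputs. A minimal numpy-only helper for that comparison (top1_accuracy is a hypothetical name, not taken from the notebook):

```python
import numpy as np

def top1_accuracy(logits: np.ndarray, labels: np.ndarray) -> float:
    """Fraction of rows whose argmax matches the integer label."""
    preds = np.argmax(logits, axis=1)
    return float(np.mean(preds == labels))

# Example: three predictions, two of which match their labels.
logits = np.array([[0.1, 0.9],   # predicts class 1
                   [0.8, 0.2],   # predicts class 0
                   [0.3, 0.7]])  # predicts class 1
labels = np.array([1, 0, 0])
print(top1_accuracy(logits, labels))  # 2 of 3 correct
```

Feeding the same helper the stacked outputs of the Keras model and of the TF-Lite interpreter makes the 0.97 vs. 0.20 gap directly comparable.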

If the model is instead defined as follows, the accuracy of the Keras model and the TF-Lite model is almost the same:

  # extract image features with convolution and max-pooling layers
  inputs = tf.keras.Input(shape=(IMG_SIZE, IMG_SIZE, 3))
  x = tf.keras.layers.Conv2D(32, kernel_size=3, padding="same", activation="relu")(inputs)
  x = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))(x)
  x = tf.keras.layers.Conv2D(64, kernel_size=3, padding="same", activation="relu")(x)
  x = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))(x)
  # classify with fully-connected layers
  x = tf.keras.layers.Flatten()(x)
  x = tf.keras.layers.Dense(512, activation="relu")(x)
  x = tf.keras.layers.Dense(info.features['label'].num_classes)(x)
  x = tf.keras.layers.Activation("softmax")(x)
  model_functional = tf.keras.Model(inputs=inputs, outputs=x)

Code to reproduce the issue (Google Colab notebook) https://gist.github.com/NobuoTsukamoto/b42128104531a7612e5c85e246cb2dac


Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 19 (7 by maintainers)

Top GitHub Comments

4 reactions
nutsiepully commented, May 5, 2020

We’ve found the issue. One of the quantized kernels’ activation ranges had a problem, but it was being hidden once the range had converged.

We’ll have a fix out soon. tf-nightly should have it.
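The "activation range" in the fix above is the [min, max] interval each fake-quantization node learns during QAT: activations are clipped to it and rounded onto a fixed integer grid, so a wrong range silently distorts every value passing through. A minimal numpy illustration of affine fake-quantization (an illustrative sketch, not the tfmot implementation):

```python
import numpy as np

def fake_quantize(x, range_min, range_max, num_bits=8):
    """Clip x to [range_min, range_max], round onto a (2**num_bits - 1)-step
    grid, and map back to float -- the simulation QAT inserts after activations."""
    levels = 2 ** num_bits - 1
    scale = (range_max - range_min) / levels
    x = np.clip(x, range_min, range_max)
    q = np.round((x - range_min) / scale)
    return q * scale + range_min

acts = np.array([0.0, 1.5, 4.0, 6.0])
print(fake_quantize(acts, 0.0, 6.0))  # correct range: small rounding error only
print(fake_quantize(acts, 0.0, 1.0))  # wrong range: everything above 1.0 is crushed
```

With the correct range the round-trip error stays within one quantization step; with a wrong range the clipping destroys information, which is how a bad learned range can tank TF-Lite accuracy while the float Keras model still looks fine.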

2 reactions
nutsiepully commented, May 7, 2020

Thanks a lot @sayakpaul. Really appreciate the feedback and the effort.

Thanks @kmkolasinski and @NobuoTsukamoto for the detailed bug reports and feedback. I’m closing the bug. Please reopen if you face any further issues.

@sayakpaul, the report is awesome! Great work, this explains the value of the tooling really well.


