Per-Tensor quantization support for Conv2D layers
System information
- TensorFlow version (you are using): TF nightly 2.3
- Are you willing to contribute it (Yes/No): No
Motivation
I’ve been testing TF QAT features by following the tutorials and guides on the following website:
https://www.tensorflow.org/model_optimization/guide/quantization/training_comprehensive_guide
To my understanding, TF only has per-axis support for Conv2D layers and is still working on per-tensor support. Right now, I’m working with a deployment target that requires per-tensor quantization for Conv2D, and simply passing a custom QuantizeConfig class to the Conv2D layer with the weight quantizer’s per_axis set to False causes errors with the TF quantize API (roughly what I tried is sketched below).
Hence I’m wondering if there are any resources or additional experimental features that I can try out to perform per-tensor quantization for Conv2D layers?
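For context, here is roughly what I tried, adapted from the Dense example in the comprehensive guide. This is a minimal sketch, not the exact code I ran; the class name, layer sizes and input shape are placeholders.

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

LastValueQuantizer = tfmot.quantization.keras.quantizers.LastValueQuantizer
MovingAverageQuantizer = tfmot.quantization.keras.quantizers.MovingAverageQuantizer


class PerTensorConvQuantizeConfig(tfmot.quantization.keras.QuantizeConfig):
    """Quantize the Conv2D kernel per-tensor (per_axis=False) instead of per-axis."""

    def get_weights_and_quantizers(self, layer):
        # One scalar min/max pair for the whole kernel instead of one per output channel.
        return [(layer.kernel,
                 LastValueQuantizer(num_bits=8, symmetric=True,
                                    narrow_range=False, per_axis=False))]

    def get_activations_and_quantizers(self, layer):
        return [(layer.activation,
                 MovingAverageQuantizer(num_bits=8, symmetric=False,
                                        narrow_range=False, per_axis=False))]

    def set_quantize_weights(self, layer, quantize_weights):
        layer.kernel = quantize_weights[0]

    def set_quantize_activations(self, layer, quantize_activations):
        layer.activation = quantize_activations[0]

    def get_output_quantizers(self, layer):
        return []

    def get_config(self):
        return {}


# Annotate a Conv2D layer with the custom config and apply QAT.
annotated_model = tfmot.quantization.keras.quantize_annotate_model(
    tf.keras.Sequential([
        tfmot.quantization.keras.quantize_annotate_layer(
            tf.keras.layers.Conv2D(16, 3, activation='relu',
                                   input_shape=(28, 28, 1)),
            quantize_config=PerTensorConvQuantizeConfig()),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10),
    ]))

with tfmot.quantization.keras.quantize_scope(
        {'PerTensorConvQuantizeConfig': PerTensorConvQuantizeConfig}):
    qat_model = tfmot.quantization.keras.quantize_apply(annotated_model)
```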

Hi @nutsiepully, thank you for getting back to me. I’m wondering if there are any examples that could help me quantize the Conv2D weights per-tensor instead of per-axis? The examples in the comprehensive QAT guide are only for Dense layers, and they aren’t directly applicable to Conv2D layers.
I’ve been using this configuration for Conv2D, called Default8BitConvQuantizeConfig, which I found here: https://github.com/tensorflow/model-optimization/blob/fcaa2306d62a419c5bce700275748b8b08711dbc/tensorflow_model_optimization/python/core/quantization/keras/default_8bit/default_8bit_quantize_registry.py#L486
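For reference, I attach it to the layer roughly like this. This is a sketch: the constructor arguments (`['kernel'], ['activation'], False`) are my assumption about how the registry sets up the config for Conv2D, not something I copied verbatim.

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot
from tensorflow_model_optimization.python.core.quantization.keras.default_8bit import (
    default_8bit_quantize_registry)

# Assumed constructor arguments, mirroring how the registry appears to configure Conv2D:
# quantize the 'kernel' weights and the 'activation' attribute, don't quantize the output.
conv_quantize_config = default_8bit_quantize_registry.Default8BitConvQuantizeConfig(
    ['kernel'], ['activation'], False)

annotated_conv = tfmot.quantization.keras.quantize_annotate_layer(
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    quantize_config=conv_quantize_config)
```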
I ended up modifying the line `self.weight_quantizer = default_8bit_quantizers.Default8BitConvWeightsQuantizer()` (which is per-axis by default) to per-tensor by setting the argument `per_axis=False`: https://github.com/tensorflow/model-optimization/blob/fcaa2306d62a419c5bce700275748b8b08711dbc/tensorflow_model_optimization/python/core/quantization/keras/default_8bit/default_8bit_quantizers.py
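Concretely, the edit was along these lines (a sketch from memory, not an exact copy of the source; `quantizers` is the tfmot Keras quantizers module that file already imports):

```python
# Sketch of my edit in default_8bit_quantizers.py: Default8BitConvWeightsQuantizer
# subclasses LastValueQuantizer, and the only change is flipping per_axis to False.
class Default8BitConvWeightsQuantizer(quantizers.LastValueQuantizer):
  """Quantizer for the Conv2D/DepthwiseConv2D kernel, changed here to per-tensor."""

  def __init__(self):
    super(Default8BitConvWeightsQuantizer, self).__init__(
        num_bits=8,
        per_axis=False,   # was per_axis=True (one scale per output channel)
        symmetric=True,
        narrow_range=True)
```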
Unfortunately, simply changing that caused a size mismatch, since some of the underlying code assumes that Conv2D weights are quantized per-axis.
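For anyone else hitting this, my guess (an assumption based on reading the quantizer source, so please double-check) is that the mismatch comes from the quantizer’s build() method, which still creates per-channel min/max variables even after per_axis is flipped, roughly:

```python
# Roughly what Default8BitConvWeightsQuantizer.build() does (paraphrased): the min/max
# variables get one entry per output channel, so with per_axis=False the per-tensor
# fake-quant op receives vector-shaped ranges where it expects scalars.
def build(self, tensor_shape, name, layer):
  min_weight = layer.add_weight(
      name + '_min',
      shape=(tensor_shape[-1],),  # per-channel shape; a per-tensor quantizer would need ()
      initializer=tf.keras.initializers.Constant(-6.0),
      trainable=False)
  max_weight = layer.add_weight(
      name + '_max',
      shape=(tensor_shape[-1],),
      initializer=tf.keras.initializers.Constant(6.0),
      trainable=False)
  return {'min_var': min_weight, 'max_var': max_weight}
```

So a per-tensor variant would presumably also need build() to create scalar variables (for example by not overriding the base LastValueQuantizer.build at all), not just per_axis=False in the constructor.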
@danielmimimi Hey, I have moved away from the TF framework for a while now, so I cannot recall the issue I came across here.