CONV+BN+ReLU doesn't get merged with custom quantization
Prior to filing: check that this should be a bug instead of a feature request. Everything supported, including the compatible versions of TensorFlow, is listed in the overview page of each technique. For example, the overview page of quantization-aware training is here. An issue for anything not supported should be a feature request.
Describe the bug
I am trying to perform custom quantization in a network with a Conv+BN+ReLU pattern. These layers are not getting merged, as they are with default quantization. As it turns out, this is the expected behavior, as specified in this piece of code:
Current implementation: https://github.com/tensorflow/model-optimization/blob/master/tensorflow_model_optimization/python/core/quantization/keras/default_8bit/default_8bit_transforms.py
class Conv2DBatchNormReLUQuantize(Conv2DBatchNormQuantize):
  """Ensure FQ does not get placed between Conv, BatchNorm and ReLU."""

  def pattern(self):
    return LayerPattern(
        # TODO(pulkitb): Enhance match to only occur for relu, relu1 and relu6
        'ReLU',
        inputs=[super(Conv2DBatchNormReLUQuantize, self).pattern()])

  def _replace(self, relu_layer_node, bn_layer_node, conv_layer_node):
    if _has_custom_quantize_config(
        relu_layer_node, bn_layer_node, conv_layer_node):
      return relu_layer_node

    conv_layer_node.layer['config']['activation'] = \
        keras.activations.serialize(quantize_aware_activation.NoOpActivation())
    bn_layer_node.metadata['quantize_config'] = \
        default_8bit_quantize_configs.NoOpQuantizeConfig()

    return relu_layer_node
Here, we check whether any of the layers has a custom quantize config and, based on that, change the behavior of the Conv and BN layers. This does not match my expectation. Since I am not specifying anywhere that I don't want them to be merged, these layers should be merged by default.
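For reference, here is a minimal sketch of the kind of setup that hits this path (the model and ConvQuantizeConfig below are illustrative placeholders, not my actual code): annotating the Conv2D with any custom QuantizeConfig makes _has_custom_quantize_config return True, so the transform returns early and the three layers stay separate.

import tensorflow as tf
import tensorflow_model_optimization as tfmot

LastValueQuantizer = tfmot.quantization.keras.quantizers.LastValueQuantizer


class ConvQuantizeConfig(tfmot.quantization.keras.QuantizeConfig):
  """Hypothetical stand-in for a user-defined Conv2D config."""

  def get_weights_and_quantizers(self, layer):
    # 8-bit last-value quantizer on the kernel; the contents of the config
    # don't matter here, its mere presence triggers the skip.
    return [(layer.kernel, LastValueQuantizer(
        num_bits=8, symmetric=True, narrow_range=True, per_axis=False))]

  def get_activations_and_quantizers(self, layer):
    return []

  def set_quantize_weights(self, layer, quantize_weights):
    layer.kernel = quantize_weights[0]

  def set_quantize_activations(self, layer, quantize_activations):
    pass

  def get_output_quantizers(self, layer):
    return []

  def get_config(self):
    return {}


annotate_layer = tfmot.quantization.keras.quantize_annotate_layer
annotate_model = tfmot.quantization.keras.quantize_annotate_model

inputs = tf.keras.Input(shape=(32, 32, 3))
x = annotate_layer(tf.keras.layers.Conv2D(16, 3, use_bias=False),
                   quantize_config=ConvQuantizeConfig())(inputs)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.ReLU()(x)
model = annotate_model(tf.keras.Model(inputs, x))

with tfmot.quantization.keras.quantize_scope(
    {'ConvQuantizeConfig': ConvQuantizeConfig}):
  # Conv2D, BatchNorm and ReLU are NOT merged here, because the Conv2D
  # carries a custom quantize_config.
  quant_model = tfmot.quantization.keras.quantize_apply(model)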
Expected Implementation
class Conv2DBatchNormReLUQuantize(Conv2DBatchNormQuantize):
  """Ensure FQ does not get placed between Conv, BatchNorm and ReLU."""

  def pattern(self):
    return LayerPattern(
        # TODO(pulkitb): Enhance match to only occur for relu, relu1 and relu6
        'ReLU',
        inputs=[super(Conv2DBatchNormReLUQuantize, self).pattern()])

  def _replace(self, relu_layer_node, bn_layer_node, conv_layer_node):
    # if _has_custom_quantize_config(
    #     relu_layer_node, bn_layer_node, conv_layer_node):
    #   return relu_layer_node

    conv_layer_node.layer['config']['activation'] = \
        keras.activations.serialize(quantize_aware_activation.NoOpActivation())
    bn_layer_node.metadata['quantize_config'] = \
        default_8bit_quantize_configs.NoOpQuantizeConfig()

    return relu_layer_node
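In the meantime, one way to try this behavior without editing the installed package is to monkey-patch the transform before quantizing. This is only a sketch: these are private TFMOT modules, and the attribute names (quantize_aware_activation, default_8bit_quantize_configs) are simply the imports the stock file already uses, so the paths may shift between versions.

import tensorflow as tf
from tensorflow_model_optimization.python.core.quantization.keras.default_8bit import (
    default_8bit_transforms as transforms)


def _replace_always_fold(self, relu_layer_node, bn_layer_node, conv_layer_node):
  # Same body as the stock _replace, minus the _has_custom_quantize_config check.
  conv_layer_node.layer['config']['activation'] = tf.keras.activations.serialize(
      transforms.quantize_aware_activation.NoOpActivation())
  bn_layer_node.metadata['quantize_config'] = (
      transforms.default_8bit_quantize_configs.NoOpQuantizeConfig())
  return relu_layer_node


# Install the patch before calling quantize_apply / quantize_model.
transforms.Conv2DBatchNormReLUQuantize._replace = _replace_always_fold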
System information
OS: Linux
TensorFlow version (installed from source or binary): 2.2
TensorFlow Model Optimization version (installed from source or binary): 0.3
Python version: 3.7

Hi @debapriyamaji, those lines have indeed been executed. I am using TFMOT 0.4.0. This is my quantization config:
@jackjhliu I'm not sure if you have tried anything like this.
But the problem stems from the fact that BatchNorm doesn't get folded after QAT. I believe BatchNorm is only supported per-tensor, while Conv2D is per-channel (the TF 2.0 default). Hence, when you add a configuration that makes Conv2D per-tensor for QAT, it can't get folded properly.
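To make the per-channel vs per-tensor point concrete, the difference comes down to the per_axis flag on the weight quantizer. This is a sketch of the general idea, not the exact configs used in this thread:

import tensorflow_model_optimization as tfmot

LastValueQuantizer = tfmot.quantization.keras.quantizers.LastValueQuantizer

# Default 8-bit scheme: Conv2D weights are quantized per output channel.
per_channel_weights = LastValueQuantizer(
    num_bits=8, symmetric=True, narrow_range=True, per_axis=True)

# Per-tensor override: a single scale for the whole weight tensor. Per the
# comment above, this is what keeps BatchNorm from folding into the Conv2D.
per_tensor_weights = LastValueQuantizer(
    num_bits=8, symmetric=True, narrow_range=True, per_axis=False)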
The solution I opted for might not be the best, but it has worked quite well for me and my team.
I recommend folding the BatchNorm layers before QAT, then quantizing the model. You might lose a little accuracy, but this way I'm able to avoid the issue of the BatchNorm layer still existing.
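The folding itself is just an algebraic rewrite of the Conv2D weights and bias. A minimal sketch, assuming a Conv2D with a bias followed directly by a standard BatchNormalization (rebuilding the model without the BN layers and loading the folded weights is left out):

import numpy as np


def fold_conv_bn(conv_layer, bn_layer):
  """Returns (kernel, bias) for a Conv2D equivalent to conv_layer + bn_layer.

  y = gamma * (conv(x) + b - mean) / sqrt(var + eps) + beta
    = conv_scaled(x) + folded_bias
  """
  kernel, bias = conv_layer.get_weights()
  gamma, beta, mean, variance = bn_layer.get_weights()
  scale = gamma / np.sqrt(variance + bn_layer.epsilon)

  folded_kernel = kernel * scale            # broadcasts over the output-channel axis
  folded_bias = beta + (bias - mean) * scale
  return folded_kernel, folded_bias

After copying the folded kernel and bias into the BN-free model, you run quantization-aware training on that model as usual.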
Even with this approach there are still issues; for example, after QAT and conversion to TFLite, you'll see multiple Quantize and Dequantize nodes. You can remove these, and you're left with a fully per-tensor QAT model that still has decent accuracy.
Unfortunately, this extra step requires modifying the .tflite file a little bit. I'm not familiar with that process (I only worked on the quantization while my team removed these layers), but it is possible to remove the nodes by editing the .tflite file.