
CONV+BN+ReLU doesn't get merged with custom quantization

See original GitHub issue

Prior to filing: check that this should be a bug instead of a feature request. Everything supported, including the compatible versions of TensorFlow, is listed in the overview page of each technique. For example, the overview page of quantization-aware training is here. An issue for anything not supported should be a feature request.

Describe the bug
I am trying to perform custom quantization in a network with a Conv+BN+ReLU pattern. These layers are not getting merged, as they are with default quantization. As it turns out, this is the expected behavior, as specified in this piece of code:

Current implementation
https://github.com/tensorflow/model-optimization/blob/master/tensorflow_model_optimization/python/core/quantization/keras/default_8bit/default_8bit_transforms.py

class Conv2DBatchNormReLUQuantize(Conv2DBatchNormQuantize):
  """Ensure FQ does not get placed between Conv, BatchNorm and ReLU."""

  def pattern(self):
    return LayerPattern(
        # TODO(pulkitb): Enhance match to only occur for relu, relu1 and relu6
        'ReLU',
        inputs=[super(Conv2DBatchNormReLUQuantize, self).pattern()])

  def _replace(self, relu_layer_node, bn_layer_node, conv_layer_node):
    if _has_custom_quantize_config(
        relu_layer_node, bn_layer_node, conv_layer_node):
      return relu_layer_node

    conv_layer_node.layer['config']['activation'] = \
      keras.activations.serialize(quantize_aware_activation.NoOpActivation())
    bn_layer_node.metadata['quantize_config'] = \
      default_8bit_quantize_configs.NoOpQuantizeConfig()

    return relu_layer_node

Here, we check whether any of the layers has a custom quantize config and, based on that, change the behavior of the Conv and BN layers. This does not match my expectation: since I am not specifying anywhere that I don't want them to be merged, these layers should be merged by default.

Expected Implementation

class Conv2DBatchNormReLUQuantize(Conv2DBatchNormQuantize):
  """Ensure FQ does not get placed between Conv, BatchNorm and ReLU."""

  def pattern(self):
    return LayerPattern(
        # TODO(pulkitb): Enhance match to only occur for relu, relu1 and relu6
        'ReLU',
        inputs=[super(Conv2DBatchNormReLUQuantize, self).pattern()])

  def _replace(self, relu_layer_node, bn_layer_node, conv_layer_node):
    # if _has_custom_quantize_config(
    #     relu_layer_node, bn_layer_node, conv_layer_node):
    #   return relu_layer_node

    conv_layer_node.layer['config']['activation'] = \
      keras.activations.serialize(quantize_aware_activation.NoOpActivation())
    bn_layer_node.metadata['quantize_config'] = \
      default_8bit_quantize_configs.NoOpQuantizeConfig()

    return relu_layer_node
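
For reference, a minimal sketch (not part of the original report) of the setup in which the current transform skips the Conv+BN+ReLU handling: it builds a Conv2D -> BatchNormalization -> ReLU model, attaches a custom (empty) QuantizeConfig to the Conv2D only, and then prints which quantize config ended up on each wrapped layer. The names PassthroughQuantizeConfig, build_model and clone_fn are illustrative, not part of TFMOT.

import tensorflow as tf
import tensorflow_model_optimization as tfmot


class PassthroughQuantizeConfig(tfmot.quantization.keras.QuantizeConfig):
    """Hypothetical empty config, used only to mark the Conv2D as 'custom'."""

    def get_weights_and_quantizers(self, layer):
        return []

    def get_activations_and_quantizers(self, layer):
        return []

    def set_quantize_weights(self, layer, quantize_weights):
        pass

    def set_quantize_activations(self, layer, quantize_activations):
        pass

    def get_output_quantizers(self, layer):
        return []

    def get_config(self):
        return {}


def build_model():
    inputs = tf.keras.Input((32, 32, 3))
    x = tf.keras.layers.Conv2D(8, 3)(inputs)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.ReLU()(x)
    return tf.keras.Model(inputs, x)


annotate = tfmot.quantization.keras.quantize_annotate_layer


def clone_fn(layer):
    # Any custom config on the Conv2D makes _has_custom_quantize_config()
    # return True for the Conv+BN+ReLU pattern, so _replace() returns early.
    if isinstance(layer, tf.keras.layers.Conv2D):
        return annotate(layer, quantize_config=PassthroughQuantizeConfig())
    return annotate(layer)


annotated = tfmot.quantization.keras.quantize_annotate_model(
    tf.keras.models.clone_model(build_model(), clone_function=clone_fn))

with tfmot.quantization.keras.quantize_scope(
        {'PassthroughQuantizeConfig': PassthroughQuantizeConfig}):
    qat_model = tfmot.quantization.keras.quantize_apply(annotated)

# With the default scheme the BatchNormalization wrapper gets NoOpQuantizeConfig;
# with the custom config on the Conv it does not, so fake-quant is placed between
# the Conv and the BatchNorm and the pattern is not treated as one fused unit.
for layer in qat_model.layers:
    cfg = getattr(layer, 'quantize_config', None)
    if cfg is not None:
        print(type(layer).__name__, '->', type(cfg).__name__)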

System information

  • OS: Linux
  • TensorFlow version (installed from source or binary): 2.2
  • TensorFlow Model Optimization version (installed from source or binary): 0.3
  • Python version: 3.7

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 13 (1 by maintainers)

Top GitHub Comments

2 reactions
biyoml commented on Sep 25, 2020

Hi @debapriyamaji, those lines have indeed been executed. I am using TFMOT 0.4.0. This is my quantization config:

import tensorflow as tf
from tensorflow.keras import layers
import tensorflow_model_optimization as tfmot

LastValueQuantizer = tfmot.quantization.keras.quantizers.LastValueQuantizer
MovingAverageQuantizer = tfmot.quantization.keras.quantizers.MovingAverageQuantizer


class AsymPerLayerQuantizeConfig(tfmot.quantization.keras.QuantizeConfig):
    # Configure how to quantize weights.
    def get_weights_and_quantizers(self, layer):
        quantizer = LastValueQuantizer(num_bits=8, symmetric=False, narrow_range=False, per_axis=False)
        if hasattr(layer, 'kernel'):
            return [(layer.kernel, quantizer)]
        else:
            return [(layer.depthwise_kernel, quantizer)]

    # Configure how to quantize activations.
    def get_activations_and_quantizers(self, layer):
        return [(layer.activation, MovingAverageQuantizer(num_bits=8, symmetric=False, narrow_range=False, per_axis=False))]

    def set_quantize_weights(self, layer, quantize_weights):
        # Add this line for each item returned in `get_weights_and_quantizers`
        # , in the same order
        if hasattr(layer, 'kernel'):
            layer.kernel = quantize_weights[0]
        else:
            layer.depthwise_kernel = quantize_weights[0]

    def set_quantize_activations(self, layer, quantize_activations):
        # Add this line for each item returned in `get_activations_and_quantizers`
        # , in the same order.
        layer.activation = quantize_activations[0]

    # Configure how to quantize outputs (may be equivalent to activations).
    def get_output_quantizers(self, layer):
        return []

    def get_config(self):
        return {}


def apply(model):
    quantize_annotate_layer = tfmot.quantization.keras.quantize_annotate_layer
    quantize_annotate_model = tfmot.quantization.keras.quantize_annotate_model
    quantize_scope = tfmot.quantization.keras.quantize_scope

    def clone_fn(layer):
        if type(layer) in [layers.Conv2D, layers.DepthwiseConv2D, layers.Dense]:
            print(layer.name)
            return quantize_annotate_layer(layer, quantize_config=AsymPerLayerQuantizeConfig())
        return quantize_annotate_layer(layer)

    model = quantize_annotate_model(tf.keras.models.clone_model(model, clone_function=clone_fn))

    with quantize_scope({
        'AsymPerLayerQuantizeConfig': AsymPerLayerQuantizeConfig}):
        # Use `quantize_apply` to actually make the model quantization aware.
        quant_aware_model = tfmot.quantization.keras.quantize_apply(model)

        return quant_aware_model
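
For what it's worth, a short usage sketch of the apply() function above on a hypothetical toy Conv/BN/ReLU model (the model itself is only an illustration, not from the original comment):

# Hypothetical toy model, just to exercise apply() end to end.
inputs = tf.keras.Input((32, 32, 3))
x = layers.Conv2D(8, 3)(inputs)
x = layers.BatchNormalization()(x)
x = layers.ReLU()(x)
outputs = layers.Dense(10)(layers.GlobalAveragePooling2D()(x))
float_model = tf.keras.Model(inputs, outputs)

qat_model = apply(float_model)   # annotate + quantize_apply as defined above
qat_model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])
qat_model.summary()
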
1 reaction
LLNLanLeN commented on Sep 30, 2020

@jackjhliu I'm not sure if you have tried anything like this.

But the problem stems from the fact that BatchNorm doesn't get folded after QAT. I believe BatchNorm only has per-tensor support, while Conv2D is per-channel (the TF 2.x default). Hence, when you add a configuration that makes the Conv2D per-tensor, it can't get folded properly.
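
To illustrate the per-channel vs. per-tensor difference mentioned above: to my understanding the default 8-bit scheme quantizes Conv2D kernels per output channel (symmetric, narrow range), while the custom config earlier in this thread forces per-tensor, asymmetric quantization. A rough comparison (the exact default settings are an assumption worth checking against the TFMOT source):

import tensorflow_model_optimization as tfmot

# Roughly what the default 8-bit scheme uses for Conv2D kernels
# (one scale per output channel, symmetric):
per_channel_weights = tfmot.quantization.keras.quantizers.LastValueQuantizer(
    num_bits=8, symmetric=True, narrow_range=True, per_axis=True)

# What the custom config earlier in this thread uses (per-tensor, asymmetric):
per_tensor_weights = tfmot.quantization.keras.quantizers.LastValueQuantizer(
    num_bits=8, symmetric=False, narrow_range=False, per_axis=False)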

The solution I opted for might not be the best, but it has worked quite well for me and my team.

I recommend folding the BatchNorm layers into the preceding Conv layers before QAT, and then quantizing the folded model. You might lose a little accuracy, but this way I am able to avoid the issue of the BatchNorm layers still being there.
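
A minimal sketch of what folding an inference-mode BatchNormalization into the preceding Conv2D could look like (assuming channels-last data, a Conv2D with a linear activation feeding directly into the BN, and the hypothetical helper name fold_bn_into_conv; this is not the commenter's actual code):

import numpy as np


def fold_bn_into_conv(conv, bn):
    """Return (kernel, bias) so that a Conv2D with these weights is
    equivalent to conv followed by bn (inference mode, channels last)."""
    kernel = conv.kernel.numpy()                    # (kh, kw, in_ch, out_ch)
    out_ch = kernel.shape[-1]
    bias = conv.bias.numpy() if conv.use_bias else np.zeros(out_ch)

    gamma = bn.gamma.numpy() if bn.gamma is not None else np.ones(out_ch)
    beta = bn.beta.numpy() if bn.beta is not None else np.zeros(out_ch)
    mean = bn.moving_mean.numpy()
    var = bn.moving_variance.numpy()

    scale = gamma / np.sqrt(var + bn.epsilon)       # one scale per output channel
    folded_kernel = kernel * scale                  # broadcasts over the last axis
    folded_bias = (bias - mean) * scale + beta
    return folded_kernel, folded_bias

The folded kernel and bias can then be assigned to a bias-enabled Conv2D in a copy of the model with the BatchNormalization layer removed, and QAT is applied to that folded model.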

Even with this approach there are still issues. For example, after QAT and converting to TFLite, we'll see multiple Quantize and Dequantize nodes. You can remove these, and you're left with a fully per-tensor QAT model that still has decent accuracy.

Unfortunately, that extra step requires modifying the TFLite file a little, and I'm not familiar with that process (I only worked on the quantization, while my team removed these nodes). But it is possible to remove them by editing the TFLite file.
