
CONV+BN+ReLU doesn't get merged with custom quantization

See original GitHub issue

Prior to filing: check that this should be a bug instead of a feature request. Everything supported, including the compatible versions of TensorFlow, is listed in the overview page of each technique. For example, the overview page of quantization-aware training is here. An issue for anything not supported should be a feature request.

Describe the bug
I am trying to perform custom quantization in a network with a Conv+BN+ReLU pattern. These layers are not getting merged, as they are with default quantization. As it turns out, this is the expected behavior, as specified in this piece of code:

Current implementation
https://github.com/tensorflow/model-optimization/blob/master/tensorflow_model_optimization/python/core/quantization/keras/default_8bit/default_8bit_transforms.py

class Conv2DBatchNormReLUQuantize(Conv2DBatchNormQuantize):
  """Ensure FQ does not get placed between Conv, BatchNorm and ReLU."""

  def pattern(self):
    return LayerPattern(
        # TODO(pulkitb): Enhance match to only occur for relu, relu1 and relu6
        'ReLU',
        inputs=[super(Conv2DBatchNormReLUQuantize, self).pattern()])

  def _replace(self, relu_layer_node, bn_layer_node, conv_layer_node):
    if _has_custom_quantize_config(
        relu_layer_node, bn_layer_node, conv_layer_node):
      return relu_layer_node

    conv_layer_node.layer['config']['activation'] = \
      keras.activations.serialize(quantize_aware_activation.NoOpActivation())
    bn_layer_node.metadata['quantize_config'] = \
      default_8bit_quantize_configs.NoOpQuantizeConfig()

    return relu_layer_node

Here, we check whether any of the layers has a custom quantize config and, based on that, change the behavior of the Conv and BN layers. This does not match my expectation: since I am not specifying anywhere that I don't want them to be merged, these layers should be merged by default.

Expected Implementation

class Conv2DBatchNormReLUQuantize(Conv2DBatchNormQuantize):
  """Ensure FQ does not get placed between Conv, BatchNorm and ReLU."""

  def pattern(self):
    return LayerPattern(
        # TODO(pulkitb): Enhance match to only occur for relu, relu1 and relu6
        'ReLU',
        inputs=[super(Conv2DBatchNormReLUQuantize, self).pattern()])

  def _replace(self, relu_layer_node, bn_layer_node, conv_layer_node):
    # if _has_custom_quantize_config(
    #     relu_layer_node, bn_layer_node, conv_layer_node):
    #   return relu_layer_node

    conv_layer_node.layer['config']['activation'] = \
      keras.activations.serialize(quantize_aware_activation.NoOpActivation())
    bn_layer_node.metadata['quantize_config'] = \
      default_8bit_quantize_configs.NoOpQuantizeConfig()

    return relu_layer_node
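
For reference, a minimal sketch (not part of the original report) of the setup in which the current transform skips the Conv+BN+ReLU handling: it builds a Conv2D -> BatchNormalization -> ReLU model, attaches a custom (empty) QuantizeConfig to the Conv2D only, and then prints which quantize config ended up on each wrapped layer. The names PassthroughQuantizeConfig, build_model and clone_fn are illustrative, not part of TFMOT.

import tensorflow as tf
import tensorflow_model_optimization as tfmot


class PassthroughQuantizeConfig(tfmot.quantization.keras.QuantizeConfig):
    """Hypothetical empty config, used only to mark the Conv2D as 'custom'."""

    def get_weights_and_quantizers(self, layer):
        return []

    def get_activations_and_quantizers(self, layer):
        return []

    def set_quantize_weights(self, layer, quantize_weights):
        pass

    def set_quantize_activations(self, layer, quantize_activations):
        pass

    def get_output_quantizers(self, layer):
        return []

    def get_config(self):
        return {}


def build_model():
    inputs = tf.keras.Input((32, 32, 3))
    x = tf.keras.layers.Conv2D(8, 3)(inputs)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.ReLU()(x)
    return tf.keras.Model(inputs, x)


annotate = tfmot.quantization.keras.quantize_annotate_layer


def clone_fn(layer):
    # Any custom config on the Conv2D makes _has_custom_quantize_config()
    # return True for the Conv+BN+ReLU pattern, so _replace() returns early.
    if isinstance(layer, tf.keras.layers.Conv2D):
        return annotate(layer, quantize_config=PassthroughQuantizeConfig())
    return annotate(layer)


annotated = tfmot.quantization.keras.quantize_annotate_model(
    tf.keras.models.clone_model(build_model(), clone_function=clone_fn))

with tfmot.quantization.keras.quantize_scope(
        {'PassthroughQuantizeConfig': PassthroughQuantizeConfig}):
    qat_model = tfmot.quantization.keras.quantize_apply(annotated)

# With the default scheme the BatchNormalization wrapper gets NoOpQuantizeConfig;
# with the custom config on the Conv it does not, so fake-quant is placed between
# the Conv and the BatchNorm and the pattern is not treated as one fused unit.
for layer in qat_model.layers:
    cfg = getattr(layer, 'quantize_config', None)
    if cfg is not None:
        print(type(layer).__name__, '->', type(cfg).__name__)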

System information

  • OS: Linux
  • TensorFlow version (installed from source or binary): 2.2
  • TensorFlow Model Optimization version (installed from source or binary): 0.3
  • Python version: 3.7

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 13 (1 by maintainers)

Top GitHub Comments

2 reactions
biyoml commented on Sep 25, 2020

Hi @debapriyamaji, those lines have indeed been executed. I am using TFMOT 0.4.0. This is my quantization config:

import tensorflow as tf
from tensorflow.keras import layers
import tensorflow_model_optimization as tfmot

LastValueQuantizer = tfmot.quantization.keras.quantizers.LastValueQuantizer
MovingAverageQuantizer = tfmot.quantization.keras.quantizers.MovingAverageQuantizer


class AsymPerLayerQuantizeConfig(tfmot.quantization.keras.QuantizeConfig):
    # Configure how to quantize weights.
    def get_weights_and_quantizers(self, layer):
        quantizer = LastValueQuantizer(num_bits=8, symmetric=False, narrow_range=False, per_axis=False)
        if hasattr(layer, 'kernel'):
            return [(layer.kernel, quantizer)]
        else:
            return [(layer.depthwise_kernel, quantizer)]

    # Configure how to quantize activations.
    def get_activations_and_quantizers(self, layer):
        return [(layer.activation, MovingAverageQuantizer(num_bits=8, symmetric=False, narrow_range=False, per_axis=False))]

    def set_quantize_weights(self, layer, quantize_weights):
        # Add this line for each item returned in `get_weights_and_quantizers`
        # , in the same order
        if hasattr(layer, 'kernel'):
            layer.kernel = quantize_weights[0]
        else:
            layer.depthwise_kernel = quantize_weights[0]

    def set_quantize_activations(self, layer, quantize_activations):
        # Add this line for each item returned in `get_activations_and_quantizers`
        # , in the same order.
        layer.activation = quantize_activations[0]

    # Configure how to quantize outputs (may be equivalent to activations).
    def get_output_quantizers(self, layer):
        return []

    def get_config(self):
        return {}


def apply(model):
    quantize_annotate_layer = tfmot.quantization.keras.quantize_annotate_layer
    quantize_annotate_model = tfmot.quantization.keras.quantize_annotate_model
    quantize_scope = tfmot.quantization.keras.quantize_scope

    def clone_fn(layer):
        if type(layer) in [layers.Conv2D, layers.DepthwiseConv2D, layers.Dense]:
            print(layer.name)
            return quantize_annotate_layer(layer, quantize_config=AsymPerLayerQuantizeConfig())
        return quantize_annotate_layer(layer)

    model = quantize_annotate_model(tf.keras.models.clone_model(model, clone_function=clone_fn))

    with quantize_scope({
        'AsymPerLayerQuantizeConfig': AsymPerLayerQuantizeConfig}):
        # Use `quantize_apply` to actually make the model quantization aware.
        quant_aware_model = tfmot.quantization.keras.quantize_apply(model)

        return quant_aware_model
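
For what it's worth, a short usage sketch of the apply() function above on a hypothetical toy Conv/BN/ReLU model (the model itself is only an illustration, not from the original comment):

# Hypothetical toy model, just to exercise apply() end to end.
inputs = tf.keras.Input((32, 32, 3))
x = layers.Conv2D(8, 3)(inputs)
x = layers.BatchNormalization()(x)
x = layers.ReLU()(x)
outputs = layers.Dense(10)(layers.GlobalAveragePooling2D()(x))
float_model = tf.keras.Model(inputs, outputs)

qat_model = apply(float_model)   # annotate + quantize_apply as defined above
qat_model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])
qat_model.summary()
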
1 reaction
LLNLanLeN commented on Sep 30, 2020

@jackjhliu I'm not sure if you have tried anything like this.

But the problem stems from the fact that BatchNorm doesn't get folded after QAT. I believe BatchNorm only has per-tensor support, while Conv2D is per-channel (the TF 2.x default). Hence, when you add a configuration that makes the Conv2D per-tensor, it can't get folded properly.
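
To illustrate the per-channel vs. per-tensor difference mentioned above: to my understanding the default 8-bit scheme quantizes Conv2D kernels per output channel (symmetric, narrow range), while the custom config earlier in this thread forces per-tensor, asymmetric quantization. A rough comparison (the exact default settings are an assumption worth checking against the TFMOT source):

import tensorflow_model_optimization as tfmot

# Roughly what the default 8-bit scheme uses for Conv2D kernels
# (one scale per output channel, symmetric):
per_channel_weights = tfmot.quantization.keras.quantizers.LastValueQuantizer(
    num_bits=8, symmetric=True, narrow_range=True, per_axis=True)

# What the custom config earlier in this thread uses (per-tensor, asymmetric):
per_tensor_weights = tfmot.quantization.keras.quantizers.LastValueQuantizer(
    num_bits=8, symmetric=False, narrow_range=False, per_axis=False)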

The solution I opted for might not be the best, but it has worked quite well for me and my team.

I recommend folding the BatchNorm layers into the preceding Conv layers before QAT, and then quantizing the folded model. You might lose a little accuracy, but this way I am able to avoid the issue of the BatchNorm layers still being there.
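
A minimal sketch of what folding an inference-mode BatchNormalization into the preceding Conv2D could look like (assuming channels-last data, a Conv2D with a linear activation feeding directly into the BN, and the hypothetical helper name fold_bn_into_conv; this is not the commenter's actual code):

import numpy as np


def fold_bn_into_conv(conv, bn):
    """Return (kernel, bias) so that a Conv2D with these weights is
    equivalent to conv followed by bn (inference mode, channels last)."""
    kernel = conv.kernel.numpy()                    # (kh, kw, in_ch, out_ch)
    out_ch = kernel.shape[-1]
    bias = conv.bias.numpy() if conv.use_bias else np.zeros(out_ch)

    gamma = bn.gamma.numpy() if bn.gamma is not None else np.ones(out_ch)
    beta = bn.beta.numpy() if bn.beta is not None else np.zeros(out_ch)
    mean = bn.moving_mean.numpy()
    var = bn.moving_variance.numpy()

    scale = gamma / np.sqrt(var + bn.epsilon)       # one scale per output channel
    folded_kernel = kernel * scale                  # broadcasts over the last axis
    folded_bias = (bias - mean) * scale + beta
    return folded_kernel, folded_bias

The folded kernel and bias can then be assigned to a bias-enabled Conv2D in a copy of the model with the BatchNormalization layer removed, and QAT is applied to that folded model.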

Even with this approach there are still issues. For example, after QAT and converting to TFLite, we'll see multiple Quantize and Dequantize nodes. You can remove these, and you're left with a fully per-tensor QAT model that still has decent accuracy.

Unfortunately, that extra step requires modifying the TFLite file a little, and I'm not familiar with that process (I only worked on the quantization, while my team removed these nodes). But it is possible to remove them by editing the TFLite file.
