
Quantization Not Fusing Pad to Conv2D

See original GitHub issue

Recently, I quantized a pre-trained ResNet50 model from fp32 to int8, and the performance isn’t what I expected: the int8 model is only about 2x faster than the equivalent fp32 model. Investigating further in Netron, I noticed that INC didn’t seem to fuse Pad with any subsequent operations.
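
A quick way to double-check this outside of Netron is to scan the frozen graph’s node types. Here’s a minimal sketch, assuming the int8 model is a frozen GraphDef (the filename is a placeholder):

    import tensorflow as tf

    # Load the frozen int8 graph (filename is a placeholder).
    graph_def = tf.compat.v1.GraphDef()
    with open('resnet50_int8.pb', 'rb') as f:
        graph_def.ParseFromString(f.read())

    ops = [n.op for n in graph_def.node]
    print('Pad nodes left in the graph:', ops.count('Pad'))
    print('Quantized conv variants:',
          sorted({op for op in ops if 'QuantizedConv2D' in op}))

If Pad had been fused, the Pad count should drop to zero (or near it), with the padding absorbed into the quantized conv nodes.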

I’ve got two int8 models:

  1. Model A: the model I downloaded from the Intel Model Zoo (https://github.com/IntelAI/models/blob/master/benchmarks/image_recognition/tensorflow/resnet50/inference/int8/README.md).
  2. Model B: the model I quantized myself using INC, following the example instructions (https://github.com/intel/neural-compressor/tree/master/examples/tensorflow/object_detection/tensorflow_models/quantization/ptq).

Model A is available from the link above; Model B is attached here. GitHub wouldn’t allow me to upload .pb files, so I changed the file extension to .zip.

My question is this: does INC support fusing Pad operations, or am I running the scripts incorrectly?
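
For context, the example scripts wrap INC’s Python API. Here’s a minimal sketch of the INC 1.x-era post-training quantization flow; the YAML config, model path, and toy calibration data below are placeholders, not taken from the issue:

    import numpy as np
    from neural_compressor.experimental import Quantization, common

    # Toy calibration data; a real run would feed preprocessed ImageNet samples.
    dataset = [(np.random.rand(224, 224, 3).astype(np.float32), 0)
               for _ in range(10)]

    quantizer = Quantization('resnet50_v1.yaml')        # placeholder YAML config
    quantizer.model = common.Model('resnet50_fp32.pb')  # frozen fp32 graph
    quantizer.calib_dataloader = common.DataLoader(dataset)
    q_model = quantizer.fit()                           # calibrate and tune to int8
    q_model.save('resnet50_int8.pb')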

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 6 (1 by maintainers)

Top GitHub Comments

1 reaction
ftian1 commented, Apr 21, 2022

Thanks for raising this issue.

INC supports Pad + Conv fusion. The problem is that if the Pad op’s tensor value is not all 0, it may trigger a TensorFlow int8 op shape-inference check and break. The fix in the TF kernel was developed by the Intel TensorFlow team and merged only into 1.15up3, not TF 2.x. If you use the Intel TensorFlow 1.15up3 release, you will see the fused ops in the pb.

Please note that INC targets broad model coverage, not model-specific optimization.
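
To make the precondition above concrete, here is one way to dump the paddings constants of Pad nodes that feed a Conv2D, reusing the graph_def loaded in the earlier snippet; the helper name and the Conv2D-only filter are illustrative, not INC internals:

    from tensorflow.python.framework import tensor_util

    def paddings_feeding_conv2d(graph_def):
        # Const nodes by name, plus every tensor consumed by a Conv2D.
        consts = {n.name: n for n in graph_def.node if n.op == 'Const'}
        conv_inputs = {i.split(':')[0] for n in graph_def.node
                       if n.op == 'Conv2D' for i in n.input}
        result = {}
        for n in graph_def.node:
            if n.op == 'Pad' and n.name in conv_inputs:
                # A Pad's second input is its paddings tensor.
                paddings_name = n.input[1].split(':')[0]
                if paddings_name in consts:
                    tensor = consts[paddings_name].attr['value'].tensor
                    result[n.name] = tensor_util.MakeNdarray(tensor)
        return result

On this reading, Pads whose paddings are nonzero are the ones that would hit the int8 shape-inference check on stock TF 2.x.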

0 reactions
lvliang-intel commented, Dec 10, 2022

@matthew-olson-intel The PR https://github.com/tensorflow/tensorflow/pull/53480 relates to Pad + Conv3D, not Pad + Conv2D. The good news is that this issue has already been fixed in TF SPR-Base and will be upstreamed to stock TF soon. You can try the quantization with SPR-Base as aligned in Teams. Please let us know if you run into any issues.

