Quantization Not Fusing Pad to Conv2D
Recently, I quantized a pre-trained ResNet50 model from fp32 to int8, and the performance isn't what I expected: the int8 model is only about 2x faster than the equivalent fp32 model. Investigating further in Netron, I noticed that INC didn't seem to fuse Pad with any later operations.
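To double-check beyond eyeballing Netron, here is a minimal sketch that counts the Pad ops left standalone in a frozen graph (the file name is a placeholder):

```python
import tensorflow as tf

# Load the frozen int8 graph (placeholder file name).
graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile('resnet50_int8.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

# Any Pad/PadV2 node left in the graph was not fused into a downstream conv.
pads = [n.name for n in graph_def.node if n.op in ('Pad', 'PadV2')]
convs = [n.name for n in graph_def.node if 'Conv2D' in n.op]
print(f'{len(pads)} standalone Pad ops (expect 0 if fusion worked)')
print(f'{len(convs)} conv ops, first few: {convs[:3]}')
```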
I’ve got two int8 models:
- Model A: Model that I downloaded from the Intel Model Zoo (https://github.com/IntelAI/models/blob/master/benchmarks/image_recognition/tensorflow/resnet50/inference/int8/README.md).
- Model B: Model that I quantized with INC, following the example instructions (https://github.com/intel/neural-compressor/tree/master/examples/tensorflow/object_detection/tensorflow_models/quantization/ptq).
Model A is available from the above link, but here's Model B. GitHub wouldn't let me upload PB files, so I changed the file extension to .zip.
My question is this: does INC support fusing Pad operations, or am I running the scripts incorrectly?
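For context, the quantization step in those instructions boils down to roughly the following sketch, assuming the INC 1.x experimental `Quantization` API that the examples used (the YAML config and file names here are placeholders; the real scripts supply their own via command-line arguments):

```python
from neural_compressor.experimental import Quantization, common

# Sketch only: placeholder config and model paths.
quantizer = Quantization('resnet50_v1.yaml')
quantizer.model = common.Model('resnet50_fp32_pretrained_model.pb')
q_model = quantizer.fit()           # post-training quantization
q_model.save('resnet50_int8.pb')    # write the quantized frozen graph
```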
Thanks for raising this issue.

INC does support Pad + Conv fusion. The problem is that if the Pad op's tensor value is not all zeros, it may trigger TensorFlow's int8 op shape-inference check and break. The kernel fix was developed by the Intel TensorFlow team and was merged only into Intel TensorFlow 1.15up3, not into TF 2.x. If you quantize with Intel TensorFlow 1.15up3, you will see the fused ops in the pb (a sketch for inspecting a graph's Pad ops follows below).

Please note that INC targets broad model coverage rather than model-specific optimization.
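As a hypothetical check (the file name and the `const_value` helper are placeholders for illustration), you can list each Pad-family op and its constant inputs to see which ones are plain zero-padding and therefore fusion candidates:

```python
import tensorflow as tf
from tensorflow.python.framework import tensor_util

# Load the frozen fp32 graph (placeholder file name).
graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile('resnet50_fp32_pretrained_model.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

consts = {n.name: n for n in graph_def.node if n.op == 'Const'}

def const_value(input_name):
    """Resolve a node input to its constant ndarray, if it is a Const."""
    node = consts.get(input_name.split(':')[0])
    return tensor_util.MakeNdarray(node.attr['value'].tensor) if node is not None else None

for n in graph_def.node:
    if n.op in ('Pad', 'PadV2', 'MirrorPad'):
        paddings = const_value(n.input[1])
        # PadV2 takes an explicit pad value as a third input;
        # plain Pad always pads with 0.
        pad_value = const_value(n.input[2]) if len(n.input) > 2 else 0
        print(n.op, n.name,
              paddings.tolist() if paddings is not None else '?',
              pad_value)
```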
@matthew-olson-intel The PR https://github.com/tensorflow/tensorflow/pull/53480 is related to Pad + Conv3D fusion, not Pad + Conv2D. The good news is that this issue has already been fixed in TF SPR-Base and will be upstreamed to stock TF soon. You can try the quantization with SPR-Base as aligned in Teams. Please let us know if you run into any issues.