Cannot quantize model in per-tensor way
Hello team,
I am trying to quantize all the parameters of my model in a per_tensor way, but the final quantized model still contains per_channel layers.
My YAML file is as follows (a minimal driver sketch follows the config):
version: 1.0

model:                                  # mandatory. used to specify model specific information.
  name: mobilenetv2
  framework: onnxrt_qlinearops          # mandatory. supported values are tensorflow, pytorch, pytorch_ipex, onnxrt_integerops, onnxrt_qlinearops or mxnet; allow new framework backend extension.

quantization:                           # optional. tuning constraints on model-wise for advanced user to reduce tuning space.
  approach: post_training_static_quant  # optional. default value is post_training_static_quant.
  calibration:
    dataloader:
      batch_size: 1
      dataset:
        ImagenetRaw:
          data_path: /home/tau/Workspace/databank/imagenet/ILSVRC/Data/CLS-LOC/val
          image_list: /home/tau/Workspace/databank/imagenet/caffe_labels/val.txt  # download from http://dl.caffe.berkeleyvision.org/caffe_ilsvrc12.tar.gz
      transform:
        Rescale: {}
        Resize:
          size: 256
        CenterCrop:
          size: 224
        Normalize:
          mean: [0.485, 0.456, 0.406]
          std: [0.229, 0.224, 0.225]
        Transpose:
          perm: [2, 0, 1]
        Cast:
          dtype: float32
  model_wise:                           # optional. tuning constraints on model-wise for advanced user to reduce tuning space.
    weight:
      granularity: per_tensor
      scheme: asym
      dtype: int8
      algorithm: minmax
    activation:
      granularity: per_tensor
      scheme: asym
      algorithm: minmax

tuning:
  accuracy_criterion:
    relative: 0.02                      # optional. default value is relative, other value is absolute. this example allows relative accuracy loss: 2%.
  exit_policy:
    timeout: 0                          # optional. tuning timeout (seconds). default value is 0 which means early stop. combine with max_trials field to decide when to exit.
  random_seed: 9527                     # optional. random seed for deterministic tuning.
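For completeness, this is roughly how I drive the config (a minimal sketch; the paths conf.yaml and mobilenetv2.onnx are placeholders for my local files, and I am using the 1.x experimental Quantization API):

    from neural_compressor.experimental import Quantization, common

    # Load the calibration/tuning settings from the YAML above.
    quantizer = Quantization("conf.yaml")

    # Wrap the FP32 ONNX model; the path is a placeholder.
    quantizer.model = common.Model("mobilenetv2.onnx")

    # Run calibration + post-training static quantization, then save the INT8 model.
    q_model = quantizer.fit()
    q_model.save("mobilenetv2_int8.onnx")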
Thanks.
Hi, version 1.12 supports the per-tensor way. If you want to get a per-tensor quantized model directly, please add a model_wise section to your yaml file, as in https://github.com/intel/neural-compressor/blob/aac0a0ec860d6d875467a8b7fb119ec18713fd48/neural_compressor/template/ptq.yaml#L43, and set 'granularity' to per_tensor.
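To double-check the result, one option (a sketch; the output path mobilenetv2_int8.onnx is just an example) is to inspect the weight-scale initializers of the QLinear ops in the saved model: a per_channel layer has a scale tensor with more than one element, while a per_tensor layer has a scalar scale.

    import onnx
    from onnx import numpy_helper

    # Example path; use the model produced by the quantization run.
    model = onnx.load("mobilenetv2_int8.onnx")
    inits = {t.name: numpy_helper.to_array(t) for t in model.graph.initializer}

    for node in model.graph.node:
        if node.op_type in ("QLinearConv", "QLinearMatMul"):
            # Input index 4 is the weight (or B-matrix) scale for these ops.
            w_scale = inits.get(node.input[4])
            if w_scale is None:
                continue
            if w_scale.size > 1:
                print(f"{node.name}: per_channel weight scale ({w_scale.size} values)")
            else:
                print(f"{node.name}: per_tensor weight scale")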
Thanks @mengniwang95, this will be of great help to us.