Error: Expected shape from model of {} does not match actual shape of {1,1,1} for output
Problem
I’m getting the following error when I’m trying to apply static quantization (ONNX) with the `ORTQuantizer`.
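For reference, the failing static-quantization flow looks roughly like this (a minimal sketch following the `optimum` static-quantization example; the model id, dataset, and exact argument names are illustrative and may differ between `optimum` versions):

```python
from functools import partial

from optimum.onnxruntime import ORTModelForSequenceClassification, ORTQuantizer
from optimum.onnxruntime.configuration import AutoCalibrationConfig, AutoQuantizationConfig
from transformers import AutoTokenizer

model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # illustrative model id

# Export the model to ONNX and attach a quantizer to it
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)
quantizer = ORTQuantizer.from_pretrained(model)

# Static quantization requires a calibration step
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=True, per_channel=False)

def preprocess_fn(examples, tokenizer):
    return tokenizer(examples["sentence"], padding="max_length", truncation=True)

calibration_dataset = quantizer.get_calibration_dataset(
    "glue",
    dataset_config_name="sst2",
    preprocess_function=partial(preprocess_fn, tokenizer=tokenizer),
    num_samples=50,
    dataset_split="train",
)
calibration_config = AutoCalibrationConfig.minmax(calibration_dataset)

# The shape error is raised during the calibration / quantization step
ranges = quantizer.fit(
    dataset=calibration_dataset,
    calibration_config=calibration_config,
    operators_to_quantize=qconfig.operators_to_quantize,
)
quantizer.quantize(
    save_dir="quantized_model",
    calibration_tensors_range=ranges,
    quantization_config=qconfig,
)
```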
Tests
This error occurs for:
- my custom script
- the example code in the README.md
- the example [notebook](https://github.com/huggingface/notebooks/blob/master/examples/text_classification_quantization_ort.ipynb) in this repository
- a brand new project with only `transformers`, `datasets`, and `optimum[onnxruntime]` installed
- a brand new project with only `transformers`, `datasets`, and `optimum[onnxruntime]` (with `python -m pip install git+https://github.com/huggingface/optimum.git` installed)
More
- The resulting `model-quantized.onnx` can be loaded but produces very bad results.
- Dynamic quantization works seamlessly (see the sketch after this list).
- Using:
  - Python 3.9
  - tested on two different devices with different operating systems:
    - macOS Monterey (with Intel)
    - WSL for Windows 11 (Ubuntu)
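By contrast, the dynamic path needs no calibration step and completes without the shape error. A minimal sketch of that working path, under the same assumptions as the static sketch above:

```python
from optimum.onnxruntime import ORTModelForSequenceClassification, ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

# Export the model to ONNX and attach a quantizer to it
model = ORTModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english",  # illustrative model id
    export=True,
)
quantizer = ORTQuantizer.from_pretrained(model)

# Dynamic quantization: weights are quantized ahead of time and activations
# on the fly at inference, so no calibration dataset (and no fit step) is needed
dqconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)
quantizer.quantize(save_dir="dynamic_quantized_model", quantization_config=dqconfig)
```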
Top GitHub Comments
Hi @realjanpaulus,
Thanks for sharing your experiment results; analyzing which parts of the model are sensitive to quantization is a very interesting topic.
I would tend to say that, in general, the more calibration data you provide, the more confident we can be in the estimated quantization parameters. This is however not true for calibration methods such as minmax, which take the global minimum and maximum values (there is currently no option to compute those values using an exponential moving average). More data then results in a wider quantization range, leading to a decrease in precision and very likely a drop in the final model’s performance. Even though this does not hold for every model / task / calibration method combination, I found that for BERT models on text classification tasks, 40 to 50 examples gave good results when using the minmax calibration method.
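Concretely, that trade-off is controlled by the calibration dataset size (`num_samples`) and the `AutoCalibrationConfig` factory; a minimal sketch, assuming `quantizer`, `tokenizer`, and `preprocess_fn` are set up as in the static-quantization sketch above:

```python
from functools import partial

from optimum.onnxruntime.configuration import AutoCalibrationConfig

# 40 to 50 samples tend to be enough for BERT-style text classification
calibration_dataset = quantizer.get_calibration_dataset(
    "glue",
    dataset_config_name="sst2",
    preprocess_function=partial(preprocess_fn, tokenizer=tokenizer),
    num_samples=50,
    dataset_split="train",
)

# minmax keeps the global min/max seen over the whole calibration set,
# so adding more data can only widen the range and reduce precision
calibration_config = AutoCalibrationConfig.minmax(calibration_dataset)
```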
I hope this helps!
I will close this issue as the initial problem is now solved; if you have other questions, please feel free to open another one.
Hi @realjanpaulus,
We reported the issue to the ORT folks and it should be fixed in the next release 👍🏻.
In the meantime, I can confirm it doesn’t impact the final performance of the model:
↪️ https://github.com/microsoft/onnxruntime/issues/10504
Regarding the second point, histogram-based methods are currently failing for some parameter combinations; this should also be fixed in the next release, as the PR has been merged upstream:
➡️ https://github.com/microsoft/onnxruntime/issues/10571