Unable to see the weight files after quantization
System Info
- `transformers` version: 4.23.1
- Platform: Linux-5.10.133+-x86_64-with-Ubuntu-18.04-bionic
- Python version: 3.7.15
- Huggingface_hub version: 0.10.1
- PyTorch version (GPU?): 1.12.1+cu113 (False)
- Tensorflow version (GPU?): 2.9.2 (False)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, …)
- My own task or dataset (give details below)
Reproduction
I have tried the following code for dynamic quantization:

```python
import os

import torch
from transformers import AutoModel

# Load the FP32 model and dynamically quantize all Linear layers to int8
model = AutoModel.from_pretrained("bert-base-uncased")
model_quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# Save the quantized model
quantized_output_dir = "quantized/"
if not os.path.exists(quantized_output_dir):
    os.makedirs(quantized_output_dir)
model_quantized.save_pretrained(quantized_output_dir)
```
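(Side note, not in the original report: a quick sanity check that the quantization step itself succeeded is to print the type of one of the projection layers; it should be a dynamically quantized `Linear`. The attribute path below assumes the `BertModel` returned by `AutoModel`.)

```python
# The attention projections should now be dynamically quantized Linear modules
print(type(model_quantized.encoder.layer[0].attention.self.query))
# e.g. <class 'torch.nn.quantized.dynamic.modules.linear.Linear'>
```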
After the execution, I can see that a new folder named `quantized` was created, but it contains only the `config.json` file. Its contents are as follows:
```json
{
  "_name_or_path": "bert-base-uncased",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.23.1",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
```
I can't see any other `.bin` or `.wt` files after quantization. Why is that?
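(A diagnostic sketch, not from the original issue: the quantized weights do exist in memory; dynamically quantized `Linear` modules store them as packed parameters in the state dict, which suggests the problem lies in serialization rather than in the quantization step itself.)

```python
# Illustrative: list the packed-parameter entries produced by quantize_dynamic
sd = model_quantized.state_dict()
packed = [k for k in sd if "_packed_params" in k]
print(len(packed), packed[:2])
```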
Expected behavior
The model should be quantized, and the new quantized weight files should be saved in the provided folder along with the `config.json` file.
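(A possible workaround, my assumption rather than a confirmed fix from this thread: since `save_pretrained` apparently does not write the weights for a dynamically quantized model in this version, the quantized state dict can be saved and reloaded with plain `torch.save`/`torch.load`. The file name below is arbitrary.)

```python
import torch
from transformers import AutoModel

# Persist the quantized weights directly, bypassing save_pretrained
torch.save(model_quantized.state_dict(), "quantized/pytorch_model.bin")

# To reload: rebuild the architecture, re-apply the same quantization,
# then load the saved state dict into the quantized skeleton
model = AutoModel.from_pretrained("bert-base-uncased")
model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
model.load_state_dict(torch.load("quantized/pytorch_model.bin"))
```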
Top GitHub Comments
@michaelbenayoun Thank you very much for the comments.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.