
Unable to see the weight files after quantization

See original GitHub issue

System Info

  • transformers version: 4.23.1
  • Platform: Linux-5.10.133+-x86_64-with-Ubuntu-18.04-bionic
  • Python version: 3.7.15
  • Huggingface_hub version: 0.10.1
  • PyTorch version (GPU?): 1.12.1+cu113 (False)
  • Tensorflow version (GPU?): 2.9.2 (False)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, …)
  • My own task or dataset (give details below)

Reproduction

I tried the following code for dynamic quantization:

import torch
import os
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")
# Swap every nn.Linear module for a dynamically quantized int8 version.
model_quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# Save the quantized model (create the output directory if needed).
quantized_output_dir = "quantized/"
os.makedirs(quantized_output_dir, exist_ok=True)
model_quantized.save_pretrained(quantized_output_dir)

After execution, I can see that a new folder named quantized was created in the directory, but it contains only the config.json file. Its contents are as follows:

{
  "_name_or_path": "bert-base-uncased",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.23.1",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}

I can’t see any .bin or .wt weight files after quantization. Why is that?
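
For what it’s worth, the quantization step itself appears to succeed; a minimal sketch (reusing model_quantized from the snippet above) confirms that quantize_dynamic has swapped the nn.Linear layers for their dynamically quantized counterparts:

# Confirm the swap: after quantize_dynamic, the encoder's linear layers
# should be instances of torch.nn.quantized.dynamic.Linear.
for name, module in model_quantized.named_modules():
    if isinstance(module, torch.nn.quantized.dynamic.Linear):
        print(name, "->", type(module).__name__)
        break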

Expected behavior

The model should be quantized, and the new quantized weight files should be saved in the provided folder alongside the config.json file.
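
For reference, the pattern used in PyTorch’s own quantization tutorials is to persist a dynamically quantized model through its state dict rather than through save_pretrained. A minimal sketch, continuing from the reproduction above (the file name pytorch_model.bin is an assumption, not something mandated by this thread):

import torch
from transformers import AutoModel

# Save: serialize the quantized weights directly.
torch.save(model_quantized.state_dict(), "quantized/pytorch_model.bin")

# Load: rebuild the float model, apply the same dynamic quantization,
# then restore the saved quantized weights into it.
model = AutoModel.from_pretrained("bert-base-uncased")
model_reloaded = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
model_reloaded.load_state_dict(torch.load("quantized/pytorch_model.bin"))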

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments:8 (4 by maintainers)

Top GitHub Comments

1 reaction
pradeepdev-1995 commented, Oct 28, 2022

@michaelbenayoun Thank you very much for the comments.

0 reactions
github-actions[bot] commented, Nov 26, 2022

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Read more comments on GitHub >

Top Results From Across the Web

Quantized weights can’t be loaded, when tried, generate ...
The error means that input to quantize_per_tensor is a quantized tensor, which is unsupported. The device needs to be CPU for quantize-per- ...
Read more >

How can I view weights in a .tflite file? - Stack Overflow
The output will be a folder called tflite. Then you can load the model and get the tensor you want. Tensor has...
Read more >

Quantization aware training comprehensive guide - TensorFlow
Welcome to the comprehensive guide for Keras quantization aware training. This page documents various use cases and shows how to use the API ...
Read more >

Quantizing a Model - Qualcomm Developer Network
After getting new raw files and updating image_file_list.txt, you should be able to quantize the dlc successfully. 80-NL315-14 A MAY CONTAIN U.S. AND ...
Read more >

NVIDIA Deep Learning TensorRT Documentation
When quantizing a floating-point tensor, TensorRT must know its dynamic range - that is, ... you can specify that it may later have...
Read more >
