The output of IBERT is float32. Am I doing something wrong?
Environment info
- transformers version: 4.5.1
- Platform: Linux-5.8.0-49-generic-x86_64-with-glibc2.10
- Python version: 3.8.5
- PyTorch version (GPU?): 1.7.1 (True)
- Tensorflow version (GPU?): not installed (NA)
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: DDP (pytorch-lightning)
Who can help
@LysandreJik, @patil-suraj, @patrickvonplaten
Information
I’m trying I-BERT. The first output of the model is float32, so I’m curious why that happens. I set quant_mode=True.
The problem arises when using:
- the official example scripts: (give details below)
- my own modified scripts: (give details below)
The task I am working on is:
- an official GLUE/SQUaD task: (give the name)
- my own task or dataset: (give details below)
I’m using MS MARCO (an IR dataset).
To reproduce
Steps to reproduce the behavior:
- Initialize a model with the command AutoModel.from_pretrained('kssteven/ibert-roberta-base', quant_mode=True, add_pooling_layer=False)
- Check the dtype of the model output (see the sketch below).
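Below is a minimal sketch of these two steps. The from_pretrained call is taken from the report; the tokenizer loading and the placeholder input text are assumptions added only to make the snippet runnable:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Load the I-BERT checkpoint with quant_mode=True and no pooling layer,
# exactly as in the reproduction steps above.
model = AutoModel.from_pretrained(
    "kssteven/ibert-roberta-base", quant_mode=True, add_pooling_layer=False
)
tokenizer = AutoTokenizer.from_pretrained("kssteven/ibert-roberta-base")

# Any short input works; "Hello world" is just a placeholder.
inputs = tokenizer("Hello world", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# The first output tensor (last_hidden_state) comes back as float32, not int8.
print(outputs.last_hidden_state.dtype)  # torch.float32
```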
Expected behavior
The output dtype should be int8, but I see float32.
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
As you have also noticed, all the quant modules, including QuantLinear, return two tensors: quant_x and scaling_factor. Here, quant_x / scaling_factor represents the quantized (integer) value for the activation; in other words, quant_x is the dequantized value. Therefore, you do not have to multiply it by the scaling_factor.
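A small numeric sketch of the relationship described above; quant_x and scaling_factor follow the names in the comment, but the values are made up purely for illustration:

```python
import torch

# Hypothetical values: a quant module returns (quant_x, scaling_factor),
# where quant_x is the already-dequantized float32 activation and
# quant_x / scaling_factor is the underlying integer representation.
quant_x = torch.tensor([0.50, -0.25, 0.75])  # returned activation (float32)
scaling_factor = torch.tensor(0.25)          # per-tensor scale

int_repr = quant_x / scaling_factor
print(int_repr)       # tensor([ 2., -1.,  3.])
print(quant_x.dtype)  # torch.float32 -- already dequantized, so no further
                      # multiplication by scaling_factor is needed
```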
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.