Low int8 accuracy after fine-tuning BERT-XNLI
Hi, I followed the instructions in https://github.com/openvinotoolkit/nncf/tree/develop/third_party_integration/huggingface_transformers#bert-xnli to run QAT and evaluate the int8 BERT-XNLI model. The README says the int8 accuracy should be 77.22%, but my result is far worse than that: it is only 33%, as shown in the output below.
Int8 accuracy:

```
***** eval metrics *****
epoch                   = 4.0
eval_accuracy           = 0.3333
eval_loss               = 1.0988
eval_runtime            = 0:06:18.44
eval_samples            = 2490
eval_samples_per_second = 6.58
eval_steps_per_second   = 6.58
```
Note that 0.3333 is exactly chance level for the 3-class XNLI task, and the eval loss of 1.0988 is essentially ln(3) ≈ 1.0986, so the int8 model appears to be producing near-uniform predictions. In contrast, the float model, trained with fp16 precision, gives good accuracy:
FP16 accuracy:

```
***** eval metrics *****
epoch                   = 4.0
eval_accuracy           = 0.7614
eval_loss               = 0.7525
eval_runtime            = 0:00:15.91
eval_samples            = 2490
eval_samples_per_second = 156.443
eval_steps_per_second   = 156.443
```
I'm wondering what is wrong with my setup. The only change I made from the instructions is reducing the batch size from 48 to 24.
My PyTorch version is 1.9.1, my local nncf install is at commit https://github.com/openvinotoolkit/nncf/commit/1bd52822dec121a7c7ffaab7364e21232dd07ef4, and transformers is at commit bff1c71e84e392af9625c345f9ea71f7b6d75fb3, as specified in the README.
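For reference, the invocation I am running looks roughly like the sketch below. This is not the verbatim README command: the script path, the model name (bert-base-chinese), the flag spelling (--per_device_train_batch_size vs. the older --per_gpu_train_batch_size), the sequence length, and the --nncf_config file name are assumptions based on the standard HuggingFace XNLI example and the NNCF integration; only the batch size (24) and the epoch count (4) come from this report.

```bash
# Hypothetical reproduction sketch, NOT the verbatim README command.
# Script path, model name, flag spellings, and the NNCF config file name
# are assumptions; batch size 24 and 4 epochs match the run reported above.
python examples/pytorch/text-classification/run_xnli.py \
  --model_name_or_path bert-base-chinese \
  --language zh \
  --train_language zh \
  --do_train \
  --do_eval \
  --per_device_train_batch_size 24 \
  --num_train_epochs 4.0 \
  --max_seq_length 128 \
  --output_dir bert_xnli_int8 \
  --nncf_config nncf_bert_config_xnli.json
```

One related thought: if the README's learning rate was tuned for batch size 48, halving the batch size may also call for proportionally lowering the learning rate, which could be worth trying if accuracy stays low.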
Top GitHub Comments
Hope you were able to reach your goals, @masahi! Note that since training is somewhat non-deterministic, the number of epochs and the learning rate needed to reach the target 77.22% accuracy may vary: sometimes the target is reached by the 1st or 2nd epoch checkpoint, and other times it takes all 4. If you are unable to reach 77.22% or a similar value after multiple hyperparameter search attempts, though, feel free to open another issue and we will look into it.
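Since the best checkpoint can come from any epoch, one way to act on this advice is to evaluate every saved checkpoint and keep the best one. A minimal sketch, reusing the hypothetical command and output directory from the earlier sketch (the checkpoint-* naming is the HuggingFace Trainer default; whether eval of an int8 checkpoint needs the --nncf_config flag is an assumption about the integration):

```bash
# Minimal sketch: evaluate each saved checkpoint, since the epoch that hits
# the target accuracy varies run to run. Paths and flags carry over the
# assumptions from the reproduction sketch above.
for ckpt in bert_xnli_int8/checkpoint-*; do
  python examples/pytorch/text-classification/run_xnli.py \
    --model_name_or_path "$ckpt" \
    --language zh \
    --do_eval \
    --max_seq_length 128 \
    --output_dir "${ckpt}/eval" \
    --nncf_config nncf_bert_config_xnli.json
done
```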
Hi @vshampor, I tried again with PyTorch 1.9.1 and this time the accuracy looks good. I don't know what I was doing when I hit the 33% accuracy, but all is well now. Thank you, and sorry for the trouble.