
Low int8 accuracy after fine-tuning BERT-XNLI

See original GitHub issue

Hi, I followed the instructions in https://github.com/openvinotoolkit/nncf/tree/develop/third_party_integration/huggingface_transformers#bert-xnli to run quantization-aware training (QAT) on BERT-XNLI and evaluate the resulting int8 model. The README says the int8 accuracy should be 77.22%, but my result is far worse: only 33%, as shown in the output below.

Int8 accuracy:

***** eval metrics *****
  epoch                   =        4.0
  eval_accuracy           =     0.3333
  eval_loss               =     1.0988
  eval_runtime            = 0:06:18.44
  eval_samples            =       2490
  eval_samples_per_second =       6.58
  eval_steps_per_second   =       6.58

In contrast, the float model, trained with fp16 precision, gives good accuracy:

FP16 accuracy:

***** eval metrics *****
  epoch                   =        4.0
  eval_accuracy           =     0.7614
  eval_loss               =     0.7525
  eval_runtime            = 0:00:15.91
  eval_samples            =       2490
  eval_samples_per_second =    156.443
  eval_steps_per_second   =    156.443

I’m wondering what is wrong with my setup. The only change I made from the instructions was reducing the per-device batch size from 48 to 24.
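
If the smaller batch turns out to matter, one option is to restore the effective batch size with gradient accumulation. A minimal sketch using the transformers Trainer API (the output_dir here is illustrative, not a value from the README):

    from transformers import TrainingArguments

    # Halving the per-device batch size also halves the effective batch size
    # unless gradient accumulation compensates for it.
    training_args = TrainingArguments(
        output_dir="bert-xnli-int8",      # illustrative path
        num_train_epochs=4,
        per_device_train_batch_size=24,   # reduced from the README's 48
        gradient_accumulation_steps=2,    # keeps the effective batch size at 48
        fp16=True,                        # matches the fp16 float baseline above
    )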

My PyTorch version is 1.9.1, my local nncf install is at commit https://github.com/openvinotoolkit/nncf/commit/1bd52822dec121a7c7ffaab7364e21232dd07ef4, and transformers is at commit bff1c71e84e392af9625c345f9ea71f7b6d75fb3 as specified in the README.
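
For reference, my understanding of what the integration does during QAT boils down to roughly the following. This is a minimal sketch assuming the nncf.torch API; the real nncf_bert_config_xnli.json carries additional settings, and the checkpoint name here is illustrative:

    from transformers import AutoModelForSequenceClassification
    from nncf import NNCFConfig
    from nncf.torch import create_compressed_model

    # Illustrative checkpoint; the README's XNLI recipe names the exact model.
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-chinese", num_labels=3)  # XNLI is a 3-class NLI task

    # BERT takes three [batch, seq_len] integer tensors:
    # input_ids, attention_mask, token_type_ids.
    nncf_config = NNCFConfig.from_dict({
        "input_info": [
            {"sample_size": [1, 128], "type": "long"},
            {"sample_size": [1, 128], "type": "long"},
            {"sample_size": [1, 128], "type": "long"},
        ],
        "compression": {"algorithm": "quantization"},
    })

    # Inserts fake-quantization ops into the model; fine-tuning then proceeds
    # on compressed_model as usual, which is what makes this QAT.
    compression_ctrl, compressed_model = create_compressed_model(model, nncf_config)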

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 8 (4 by maintainers)

Top GitHub Comments

1 reaction
vshampor commented, Oct 11, 2021

Hope you were able to reach your goals, @masahi! Note that since training is somewhat non-deterministic, the number of epochs and the learning rate required to reach the target 77.22% accuracy may vary: sometimes the accuracy is reached by the 1st or 2nd epoch checkpoint, and other times it takes all 4. If you are unable to reach 77.22% or a similar value after multiple hyperparameter search attempts, though, feel free to open another issue and we will look into it.
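
If the best epoch varies from run to run, you can also let the Trainer keep the best checkpoint by eval accuracy instead of taking the last one. A sketch with the transformers TrainingArguments (the output_dir is illustrative):

    from transformers import TrainingArguments

    # Evaluate and save every epoch, then reload whichever epoch scored best,
    # since the target accuracy may be hit at epoch 1, 2, or 4.
    training_args = TrainingArguments(
        output_dir="bert-xnli-int8",
        num_train_epochs=4,
        evaluation_strategy="epoch",
        save_strategy="epoch",
        load_best_model_at_end=True,
        metric_for_best_model="eval_accuracy",
        seed=42,  # pinning the seed tames some of the run-to-run variance
    )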

0 reactions
masahi commented, Oct 11, 2021

Hi @vshampor, I tried again with PT 1.9.1, and this time the accuracy looks good. I don’t know what I did when I hit the 33% accuracy, but all is well now. Thank you, and sorry for the trouble.

***** eval metrics *****
  epoch                   =        4.0
  eval_accuracy           =     0.7422
  eval_loss               =     0.6483
  eval_runtime            = 0:06:17.94
  eval_samples            =       2490
  eval_samples_per_second =      6.588
  eval_steps_per_second   =      6.588
