Bert (sentence classification) output is non-deterministic for PyTorch (not for TF)
🐛 Bug
Information
Model I am using (Bert, XLNet …): Bert
Language I am using the model on (English, Chinese …): German
The problem arises when using:
- the official example scripts: (give details below)
- my own modified scripts: (give details below)
The task I am working on is:
- an official GLUE/SQuAD task: (give the name)
- my own task or dataset: (give details below)
To reproduce
Steps to reproduce the behavior:
- Load model:
```python
config = BertConfig.from_json_file(config_filename)
model = BertForSequenceClassification(config)
state_dict = torch.load(model_filename)
model.load_state_dict(state_dict)
```
- Run inference twice on the same input and compare the results (a minimal sketch follows these steps).
- Alternatively, save the first output, load the model from scratch, and run the same inference again. Even in this case, the first output will not match the later one.
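A minimal sketch of these reproduction steps, assuming a fine-tuned German BERT checkpoint; the file paths, tokenizer checkpoint, and example sentence are illustrative placeholders, not from the original report:

```python
import torch
from transformers import BertConfig, BertForSequenceClassification, BertTokenizer

# Placeholder paths for the reporter's fine-tuned model files.
config = BertConfig.from_json_file("config.json")
model = BertForSequenceClassification(config)
state_dict = torch.load("pytorch_model.bin")
model.load_state_dict(state_dict)

# Placeholder tokenizer and input sentence.
tokenizer = BertTokenizer.from_pretrained("bert-base-german-cased")
inputs = tokenizer.encode("Das ist ein Test.", return_tensors="pt")

with torch.no_grad():
    first = model(inputs)[0]   # logits from the first forward pass
    second = model(inputs)[0]  # logits from the second forward pass

# Expected: True. The bug report observes differing outputs instead.
print(torch.allclose(first, second))
```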
Expected behavior
The prediction value should be deterministic. Note that it is deterministic when the model parameters are loaded from a TensorFlow checkpoint (with `from_tf=True`).
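For reference, loading the same weights from a TensorFlow checkpoint (the case the reporter observes to be deterministic) would look roughly like this; the directory path is a placeholder:

```python
from transformers import BertForSequenceClassification

# Placeholder path; the directory is assumed to contain the TF checkpoint
# (e.g. model.ckpt.index) alongside config.json.
model = BertForSequenceClassification.from_pretrained("path/to/tf_model_dir", from_tf=True)
```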
Environment info
- `transformers` version: 2.10.0
- Platform: Linux-5.3.0-55-generic-x86_64-with-Ubuntu-19.10-eoan
- Python version: 3.7.5
- PyTorch version (GPU?): 1.5.0 (False)
- Tensorflow version (GPU?): 2.0.0 (False)
- Using GPU in script?: no
- Using distributed or parallel set-up in script?: no
Top GitHub Comments
Well, it depends. A few things may be responsible here:
- Your model may not be in evaluation mode (`model.eval()`), resulting in dropout layers affecting your results.

Can you check the logs by putting the following two lines above your model load?
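(The exact lines aren't preserved above; a plausible reconstruction, assuming the standard way to surface transformers' load-time messages at the time:)

```python
import logging

# Show transformers' INFO-level messages, including which weights
# were (or were not) loaded from the checkpoint.
logging.basicConfig(level=logging.INFO)
```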
Can you also try using the `from_pretrained` method (given that your model filename is `pytorch_model.bin`)? Or, simpler, if the configuration is in the same folder as your model file:
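(The snippet for this call isn't preserved either; assuming a folder containing both `config.json` and `pytorch_model.bin`, it would plausibly be:)

```python
from transformers import BertForSequenceClassification

# Placeholder path to the folder holding config.json and pytorch_model.bin.
model = BertForSequenceClassification.from_pretrained("path/to/model_dir")
model.eval()  # put the model in evaluation mode for deterministic inference
```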
The logging is useful when you're loading with `from_pretrained`, as it tells you which layers were not initialized from the checkpoint. For example, if your checkpoint is a base BERT model that you load into the sequence classification model, it will load, but the classifier layer will be randomly initialized. The logging would have told you 😄

Glad we could resolve your problem!
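A related check when loading manually with `load_state_dict`: PyTorch itself can report the same mismatch. A minimal sketch, with placeholder file names:

```python
import torch
from transformers import BertConfig, BertForSequenceClassification

config = BertConfig.from_json_file("config.json")  # placeholder path
model = BertForSequenceClassification(config)
state_dict = torch.load("pytorch_model.bin")       # placeholder path

# strict=False reports mismatched keys instead of raising an error.
result = model.load_state_dict(state_dict, strict=False)
print("Missing keys (left randomly initialized):", result.missing_keys)
print("Unexpected keys (ignored):", result.unexpected_keys)

model.eval()  # disable dropout so repeated inference is deterministic
```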