Behaviour of ZeroShotClassification using facebook/bart-large-mnli differs between the online demo and my local machine
Environment info
- `transformers` version: 3.4.0
- Platform: Ubuntu 20.04
- Python version: 3.7.7
- PyTorch version (GPU?): 1.6.0 (GPU: Yes)
- Tensorflow version (GPU?): No
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: No
Who can help
Information
Model I am using (Bert, XLNet …): facebook/bart-large-mnli
The problem arises when using:
- the official example scripts: (give details below)
- my own modified scripts: (give details below)
The task I am working on is:
- an official GLUE/SQUaD task: (give the name)
- my own task or dataset: (give details below)
To reproduce
Steps to reproduce the behavior:
First I tried the hosted demo online at huggingface.co, which gives me a very high score of 0.99 for travelling (as expected).
Then I tried to run the code on my local machine, which returns very different scores for all labels (poor scores):
from transformers import pipeline
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-mnli")
model = AutoModel.from_pretrained("facebook/bart-large-mnli")
zsc = pipeline(task='zero-shot-classification', tokenizer=tokenizer, model=model)
sequences = 'one day I will see the world'
candidate_labels = ['travelling', 'cooking', 'dancing']
results = zsc(sequences=sequences, candidate_labels=candidate_labels, multi_class=False)
print(results)
>>>{'sequence': 'one day I will see the world',
'labels': ['travelling', 'dancing', 'cooking'],
'scores': [0.5285395979881287, 0.2499372661113739, 0.22152313590049744]}
I got this warning message when initializing the model:
model = AutoModel.from_pretrained("facebook/bart-large-mnli")
Some weights of the model checkpoint at facebook/bart-large-mnli were not used when initializing BartModel: ['model.encoder.version', 'model.decoder.version']
- This IS expected if you are initializing BartModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BartModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
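This warning is itself a hint: the checkpoint is being loaded into BartModel, the bare encoder-decoder without a classification head. One way to confirm which class was actually instantiated (a quick sanity check, not part of the original report):

print(type(model).__name__)
# -> 'BartModel'  (the bare model, not 'BartForSequenceClassification')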
Expected behavior
The scores produced by the code on my local machine should be similar to those from the online demo.
Comments

@joeddav: Replace AutoModel with AutoModelForSequenceClassification. The former won't add the sequence classification head, i.e. it will use BartModel instead of BartForSequenceClassification, so the pipeline is trying to use just the outputs of the encoder instead of the NLI predictions in your snippet.

Author's reply: @joeddav that fixed it, thanks!
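For reference, a minimal corrected version of the snippet (a sketch against the same transformers 3.4.0 API used in the report; only the model class changes):

from transformers import pipeline
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# AutoModelForSequenceClassification loads BartForSequenceClassification,
# which includes the classification head the zero-shot pipeline relies on.
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-mnli")
model = AutoModelForSequenceClassification.from_pretrained("facebook/bart-large-mnli")
zsc = pipeline(task='zero-shot-classification', tokenizer=tokenizer, model=model)
sequences = 'one day I will see the world'
candidate_labels = ['travelling', 'cooking', 'dancing']
results = zsc(sequences=sequences, candidate_labels=candidate_labels, multi_class=False)
print(results)
# 'travelling' should now score close to the 0.99 seen in the online demo.

Alternatively, passing the checkpoint name straight to the pipeline, e.g. pipeline('zero-shot-classification', model='facebook/bart-large-mnli'), lets the pipeline resolve the correct model class on its own.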