
Behaviour of ZeroShotClassification with facebook/bart-large-mnli differs between the online demo and my local machine

See original GitHub issue

Environment info

  • transformers version: 3.4.0
  • Platform: Ubuntu 20.04
  • Python version: 3.7.7
  • PyTorch version (GPU?): 1.6.0 (GPU: Yes)
  • Tensorflow version (GPU?): No
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No

Who can help

@sshleifer

Information

Model I am using (Bert, XLNet …): facebook/bart-large-mnli

The problem arises when using:

  • the official example scripts: (give details below)
  • my own modified scripts: (give details below)

The task I am working on is:

  • an official GLUE/SQuAD task: (give the name)
  • my own task or dataset: (give details below)

To reproduce

Steps to reproduce the behavior:

First I tried the hosted demo on Hugging Face, which gives a very high score of 0.99 for travelling, as expected. (Screenshot of the demo result, captured 2020-10-29, omitted.)

Then I tried to run the code on my local machine, which returns very different scores for all labels (poor scores):

from transformers import pipeline
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-mnli")
model = AutoModel.from_pretrained("facebook/bart-large-mnli")

zsc = pipeline(task='zero-shot-classification', tokenizer=tokenizer, model=model)

sequences = 'one day I will see the world'
candidate_labels = ['travelling', 'cooking', 'dancing']

results = zsc(sequences=sequences, candidate_labels=candidate_labels, multi_class=False)
print(results)
>>> {'sequence': 'one day I will see the world',
 'labels': ['travelling', 'dancing', 'cooking'],
 'scores': [0.5285395979881287, 0.2499372661113739, 0.22152313590049744]}

I got this warning message when initializing the model with model = AutoModel.from_pretrained("facebook/bart-large-mnli"):

Some weights of the model checkpoint at facebook/bart-large-mnli were not used when initializing BartModel: ['model.encoder.version', 'model.decoder.version']
- This IS expected if you are initializing BartModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BartModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

Expected behavior

Code on my local machine’s score to be quite similar to the online demo.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
turmeric-blend commented, Oct 28, 2020

@joeddav that fixed it, thanks!

1 reaction
joeddav commented, Oct 28, 2020

Replace AutoModel with AutoModelForSequenceClassification. The former won’t add the sequence classification head, i.e. it will use BartModel instead of BartForSequenceClassification, so the pipeline is trying to use just the outputs of the encoder instead of the NLI predictions in your snippet.
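Following that advice, a minimal corrected version of the snippet from the issue might look like the sketch below (model and label names taken from the issue; the deprecated multi_class argument is simply dropped, since single-label scoring is the pipeline's default):

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# AutoModelForSequenceClassification loads BartForSequenceClassification,
# keeping the NLI classification head the zero-shot pipeline scores with.
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-mnli")
model = AutoModelForSequenceClassification.from_pretrained("facebook/bart-large-mnli")

zsc = pipeline(task="zero-shot-classification", tokenizer=tokenizer, model=model)

results = zsc(
    "one day I will see the world",
    candidate_labels=["travelling", "cooking", "dancing"],
)

# In single-label mode the scores are softmaxed across the candidate
# labels, so they compete and sum to 1.
print(results["labels"][0], results["scores"][0])
```

With the classification head in place, the local scores should match the hosted demo's high confidence for "travelling" instead of the near-uniform scores reported above.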


Top Results From Across the Web

facebook/bart-large-mnli
The method works by posing the sequence to be classified as the NLI premise and to construct a hypothesis from each candidate label....
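The premise/hypothesis mechanics described above can be sketched without loading the model at all. The hypothesis template and the dummy entailment logits below are illustrative stand-ins (the real logits come from one facebook/bart-large-mnli forward pass per premise/label pair):

```python
import numpy as np

premise = "one day I will see the world"
labels = ["travelling", "cooking", "dancing"]

# Each candidate label is slotted into a hypothesis template; the pair
# (premise, hypothesis) is then scored as an NLI entailment problem.
hypotheses = [f"This example is {label}." for label in labels]

# Dummy per-label entailment logits standing in for the model's output.
entailment_logits = np.array([4.2, -1.3, -0.7])

# In single-label mode, entailment logits are softmaxed across labels,
# yielding scores that compete and sum to 1.
scores = np.exp(entailment_logits) / np.exp(entailment_logits).sum()
```

The label whose hypothesis the model finds most entailed by the premise ends up with the highest score.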
Methods for data and user efficient annotation for multi- ...
Zero-shot learning is a setup in which a Machine Learning model can make predictions for classes it was not trained to predict. Zero-...
Zero-Shot Learning in Modern NLP | Joe Davison Blog
State-of-the-art NLP models for text classification without annotated data. ... Check out our live zero-shot topic classification demo here.
Zero-Shot Text Classification with Hugging Face
There is a live demo from Hugging Face team, along with a sample Colab ... Zero-shot classification with transformers is straightforward, ...
Hugging Face - Could not load model facebook/bart-large- ...
facebook/bart-large-mnli doesn't offer a TensorFlow model at the moment. To load the PyTorch model into the pipeline, make sure you have ...
