
facebook/bart-large-mnli input format

See original GitHub issue

Hi folks,

First off, I’ve been using you guys since the early days and think the effort and time that you put in is just phenomenal. Thank you. All the postgrads I know at the Uni of Edinburgh love HuggingFace.

My question concerns the usage of the facebook/bart-large-mnli checkpoint - specifically the input formatting. The paper mentions that the two input sentences are concatenated and appended with an EOS token, whose final hidden state is then passed to the classification head.

Something like the code below, perhaps? If this is the case, the probabilities do not seem right, seeing as the two sentences in the first pair are identical.

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

t = AutoTokenizer.from_pretrained("facebook/bart-large-mnli")
mc = AutoModelForSequenceClassification.from_pretrained("facebook/bart-large-mnli")

# Join premise and hypothesis with a literal "[EOS]" string
s1 = torch.tensor(t("i am good. [EOS] i am good.", padding="max_length")["input_ids"])
s2 = torch.tensor(t("i am good. [EOS] i am NOT good.", padding="max_length")["input_ids"])
s3 = torch.tensor(t("i am good. [EOS] i am bad.", padding="max_length")["input_ids"])

with torch.no_grad():
    logits = mc(torch.stack((s1, s2, s3)))[0]

sm = torch.nn.Softmax(dim=-1)
print(sm(logits))
# tensor([[0.2071, 0.3143, 0.4786],   # these sentences are exactly the same, so why only 0.47?
#         [0.6478, 0.1443, 0.2080],   # slightly better, but this checkpoint gets ~80% acc on MNLI
#         [0.3937, 0.2987, 0.3076]])  # almost uniform, yet the sentences are exact opposites

I note that [EOS] is not registered as a special token with the tokenizer. When I use <s> or </s> instead, I get similar results.
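For reference, here is a minimal sketch of how the pair would be encoded using the tokenizer's built-in sentence-pair handling rather than a literal [EOS] string. The label order (contradiction, neutral, entailment) is an assumption based on the checkpoint's config; verify it against mc.config.id2label.

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

t = AutoTokenizer.from_pretrained("facebook/bart-large-mnli")
mc = AutoModelForSequenceClassification.from_pretrained("facebook/bart-large-mnli")

premise = "i am good."
hypotheses = ["i am good.", "i am NOT good.", "i am bad."]

# Passing premise and hypothesis as a pair lets the tokenizer insert the
# separator tokens the model saw during fine-tuning
# (<s> premise </s></s> hypothesis </s>) instead of a literal "[EOS]"
# string, which just gets split into ordinary subword tokens.
batch = t([premise] * len(hypotheses), hypotheses, padding=True, return_tensors="pt")

with torch.no_grad():
    logits = mc(**batch).logits

# Assumed label order: [contradiction, neutral, entailment] - check mc.config.id2label
print(torch.softmax(logits, dim=-1))

Encoded this way, the identical pair should put most of its probability mass on the entailment index.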

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
sshleifer commented, Jul 14, 2020

Interesting. Happy to look into it if there’s a bug, but otherwise I think this is just a model issue. (Bug = the prediction is very different from the fairseq model for the same input).
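For anyone who wants to run that comparison, here is a rough sketch against the original fairseq checkpoint via torch.hub, assuming the bart.large.mnli hub entry and the encode/predict interface shown in the fairseq BART README:

import torch

# Load the original fairseq checkpoint (assumes the 'bart.large.mnli'
# torch.hub entry point from the fairseq BART README)
bart = torch.hub.load("pytorch/fairseq", "bart.large.mnli")
bart.eval()

# encode() takes the sentence pair and inserts fairseq's own separators
tokens = bart.encode("i am good.", "i am NOT good.")

with torch.no_grad():
    logprobs = bart.predict("mnli", tokens)

# fairseq's MNLI head: 0 = contradiction, 1 = neutral, 2 = entailment
print(logprobs.exp())

If the Hugging Face probabilities for the same pair diverge substantially from these, that would meet the bug criterion above.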

0 reactions
stale[bot] commented, Sep 24, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.


