Use fill-mask pipeline to get probability of specific token
Hi, I am trying to use the fill-mask pipeline:
nlp_fm = pipeline('fill-mask')
nlp_fm('Hugging Face is a French company based in <mask>')
And get the output:
[{'sequence': '<s> Hugging Face is a French company based in Paris</s>',
'score': 0.23106734454631805,
'token': 2201},
{'sequence': '<s> Hugging Face is a French company based in Lyon</s>',
'score': 0.08198195695877075,
'token': 12790},
{'sequence': '<s> Hugging Face is a French company based in Geneva</s>',
'score': 0.04769458621740341,
'token': 11559},
{'sequence': '<s> Hugging Face is a French company based in Brussels</s>',
'score': 0.04762236401438713,
'token': 6497},
{'sequence': '<s> Hugging Face is a French company based in France</s>',
'score': 0.041305914521217346,
'token': 1470}]
But let’s say I want to get the score & rank of another word, such as London. Is this possible?
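(Editor's note: in transformers releases newer than this issue, the fill-mask pipeline accepts a `targets` argument that does exactly this, restricting scoring to the candidate words you pass in. A minimal sketch, assuming the default fill-mask model and that the target words exist in its vocabulary:)

```python
from transformers import pipeline

nlp_fm = pipeline("fill-mask")

# Score only the candidate words we care about; targets that are not a
# single token in the model's vocabulary are handled with a warning.
out = nlp_fm(
    "Hugging Face is a French company based in <mask>",
    targets=["Paris", "London"],
)
for r in out:
    # Each result carries the filled word and its softmax probability.
    print(r["token_str"], r["score"])
```

The results come back sorted by score, so the position of "London" in the list is its rank among the supplied targets.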
Issue Analytics
- State:
- Created 3 years ago
- Comments:6 (2 by maintainers)
Top Results From Across the Web
Source code for transformers.pipelines.fill_mask - Hugging Face
This mask filling pipeline can currently be loaded from :func:`~transformers.pipeline` using the following task identifier: :obj:`"fill-mask"`.
Read more >
How to get a probability distribution over tokens in a ...
So to get token probabilities you can use a softmax over this, i.e. probs = torch.nn.functional.softmax(last_hidden_state[mask_index]). You can ...
Read more >
How to get a probability distribution over tokens in a ... - Reddit
from transformers import pipeline # Initialize MLM pipeline mlm = pipeline('fill-mask') # Get mask token mask = mlm.tokenizer.mask_token ...
Read more >
Create a Tokenizer and Train a Huggingface RoBERTa Model ...
The special tokens depend on the model, for RoBERTa we include a shortlist ... We can use the 'fill-mask' pipeline where we input...
Read more >
HOW TO USE TRANSFORMER FOR REAL LIFE PROBLEMS ...
In this article, I have discussed some use cases of transformer, ... A special mask token with a probability of 0.8; A random...
Read more >
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Hi, the pipeline doesn’t offer such a functionality yet. You’re better off using the model directly. Here’s an example of how you would replicate the pipeline’s behavior, and get a token score at the end:
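(Editor's note: the maintainer's original snippet was lost in the scrape. Below is a sketch that replicates the pipeline's behavior with the model directly and reads off the probability and rank of a chosen token such as "London"; `distilroberta-base` is assumed as the default fill-mask model.)

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "distilroberta-base"  # default model behind pipeline('fill-mask')
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

sequence = f"Hugging Face is a French company based in {tokenizer.mask_token}"
inputs = tokenizer(sequence, return_tensors="pt")

# Position of the mask token in the input sequence
mask_index = torch.where(inputs["input_ids"][0] == tokenizer.mask_token_id)[0]

with torch.no_grad():
    logits = model(**inputs).logits

# Softmax over the vocabulary at the masked position gives token probabilities
probs = logits[0, mask_index].softmax(dim=-1)

# Probability of a specific word; note the leading space, since RoBERTa's
# BPE vocabulary distinguishes " London" from "London"
token_id = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(" London"))[0]
print(f"P(London) = {probs[0, token_id].item():.4f}")

# Rank of that token among all vocabulary entries (1 = most probable)
rank = (probs[0] > probs[0, token_id]).sum().item() + 1
print(f"rank = {rank}")
```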
Outputs:
Let me know if it helps.
@LysandreJik I also get the error:
for this code. I have torch version 1.7.1. Any idea what the problem is? Might it be version-related? If so, what changes should be made to the code? Or what version should I downgrade to?