fill-mask target for full words not enabled?
System Info
- `transformers` version: 4.19.2
- Platform: Linux-5.4.188+-x86_64-with-Ubuntu-18.04-bionic
- Python version: 3.7.13
- Huggingface_hub version: 0.6.0
- PyTorch version (GPU?): 1.11.0+cu113 (False)
- Tensorflow version (GPU?): 2.8.0 (False)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>
Who can help?
@Narsil and @LysandreJik (?)
How can one use RoBERTa for fill-mask to get a full-word candidate and its "full" score with roberta-large? Open to workaround solutions.
My example:
sentence = f"Nitzsch argues against the doctrine of the annihilation of the wicked, regards the teaching of Scripture about eternal {nlp.tokenizer.mask_token} as hypothetical."
Notebook here.
Using pipeline, the output I get is:
The specified target token `damnation` does not exist in the model vocabulary. Replacing with `Ġdamn`.
Thanks.
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, …)
- My own task or dataset (give details below)
Reproduction
See notebook above.
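A minimal self-contained reproduction sketch (roberta-large assumed, per the question above):

```python
from transformers import pipeline

nlp = pipeline("fill-mask", model="roberta-large")

sentence = (
    "Nitzsch argues against the doctrine of the annihilation of the wicked, "
    f"regards the teaching of Scripture about eternal {nlp.tokenizer.mask_token} "
    "as hypothetical."
)

# RoBERTa uses byte-level BPE, so "damnation" is not a single vocabulary
# entry; the pipeline falls back to the first sub-token and emits the
# warning quoted above ("Replacing with Ġdamn.").
for result in nlp(sentence, targets=["damnation"]):
    print(result["token_str"], result["score"])
```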
Expected behavior
I expect to see "damnation" with its score.
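One possible workaround (a sketch, not a pipeline feature): insert one `<mask>` per sub-token of the target word and sum the model's log-probabilities for the word's sub-tokens at those positions. The sub-token predictions here are conditionally independent, so this is an approximate joint score useful for ranking, not an exact probability; `full_word_log_score` and the `{}` template placeholder are names invented for this sketch.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForMaskedLM.from_pretrained("roberta-large")
model.eval()

def full_word_log_score(template: str, word: str) -> float:
    # Tokenize with a leading space so RoBERTa's byte-level BPE yields the
    # mid-sentence form (" damnation" -> ["Ġdamn", "ation"]).
    word_ids = tokenizer(" " + word, add_special_tokens=False)["input_ids"]
    # One <mask> per sub-token; RoBERTa's mask token absorbs the space before it.
    text = template.format(tokenizer.mask_token * len(word_ids))
    enc = tokenizer(text, return_tensors="pt")
    mask_positions = (enc["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    with torch.no_grad():
        logits = model(**enc).logits[0]
    log_probs = torch.log_softmax(logits[mask_positions], dim=-1)
    # Sum each sub-token's log-probability at its own mask position.
    return sum(log_probs[i, tok].item() for i, tok in enumerate(word_ids))

template = (
    "Nitzsch argues against the doctrine of the annihilation of the wicked, "
    "regards the teaching of Scripture about eternal {} as hypothetical."
)
print("damnation:", full_word_log_score(template, "damnation"))
```

For a single-token target this reduces to the pipeline's own score, so the same function can rank single- and multi-token candidates together.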
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
I hear you @Narsil, it sure is non-trivial.
In my case, I would like a large-enough LM (for example, Roberta-large) to generate word candidates to start with, given some regex as hints/constraints, without knowing in advance what the best candidates are, except for those hints. My thinking is that the candidates the LM generates would more or less already fit into the context given to the model. Multiple candidates would be ranked post-fill by their scores.
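A sketch of that ranking idea, reusing the hypothetical `full_word_log_score` helper and `template` from the "Expected behavior" sketch above; the candidate pool and regex hint are invented for illustration:

```python
import re

# Hypothetical candidate pool and regex hint; in practice candidates might
# come from the model's top-k single-token predictions plus a word list.
candidates = ["damnation", "punishment", "torment", "life"]
hint = re.compile(r"^d")  # e.g. the constraint "starts with d"

scored = sorted(
    ((w, full_word_log_score(template, w)) for w in candidates if hint.search(w)),
    key=lambda pair: pair[1],
    reverse=True,
)
for word, score in scored:
    print(f"{word}: {score:.3f}")
```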
Re `zero-shot-classification`, the trouble is that without knowing in advance what the correct/best candidates are, it's more difficult to work it in.

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.