Filling more than 1 masked token at a time
I am able to use Hugging Face's mask-filling pipeline to predict 1 masked token in a sentence using the code below:
!pip install -q transformers

from transformers import pipeline

# Default fill-mask pipeline; predicts a single <mask> token
nlp_fill = pipeline('fill-mask')
nlp_fill("I am going to guess <mask> in this sentence")
But does anyone have an opinion on the best way to do this if I want to predict 2 masked tokens, e.g. if the sentence is instead "I am going to <mask> <mask> in this sentence"?
If I try to put this exact sentence into nlp_fill I get the error “ValueError: only one element tensors can be converted to Python scalars”, so it doesn't work automatically.
Any help would be much appreciated!
Stack Overflow question link
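For reference, a minimal sketch of one possible workaround: call the masked language model directly and take the top candidates for each mask independently of the other. The "distilroberta-base" checkpoint and the top-5 cutoff are illustrative assumptions, not something specified in the issue:

import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Illustrative checkpoint choice; any masked-LM checkpoint that uses a <mask> token works the same way
tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")
model = AutoModelForMaskedLM.from_pretrained("distilroberta-base")

sentence = "I am going to <mask> <mask> in this sentence"
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

# Positions of the <mask> tokens in the encoded input
mask_positions = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

# Top 5 candidates per mask, each chosen independently of the other mask
for pos in mask_positions:
    top_ids = logits[0, pos].topk(5).indices.tolist()
    print(pos.item(), tokenizer.convert_ids_to_tokens(top_ids))

The limitation of this approach is that the predictions for the two positions are independent, so they may not fit together as a phrase; the answer below addresses that by filling the masks sequentially.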
Hi, I've implemented right-to-left, left-to-right, and random mask filling in PyTorch for the top-k ids that the model thinks are the most probable tokens, in one of my projects. In this implementation, each time we fill a mask, the model looks at the previously generated sentences and decides what is most probable for the next masked position. So if a sentence has 2 masks and we set top_k=5, we get 25 sentences (5 tokens for the first position, and for each of those 5 one-mask sentences, another 5 tokens for the second mask). It outputs something like this (I used Persian models for this; I hope you can see how the masks are being filled):
Then, in the next step, we implemented a beam search to choose the most probable sequence among all of these sentences.
I'd be glad to help Hugging Face with this issue; I can send my code or open a pull request.
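Below is a minimal sketch of this left-to-right, top-k expansion idea with a beam-search-style pruning step. This is not the poster's actual code, and the "distilroberta-base" checkpoint, top_k, and beam size are illustrative assumptions:

import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")  # illustrative checkpoint
model = AutoModelForMaskedLM.from_pretrained("distilroberta-base")
model.eval()

def fill_masks_left_to_right(sentence, top_k=5, beam_size=5):
    """Fill masks one at a time, left to right, keeping the most probable partial fills."""
    # Each beam entry: (input_ids tensor of shape (seq_len,), cumulative log-probability)
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    beams = [(ids, 0.0)]

    while True:
        # All beams share the same mask positions; find the first remaining one
        mask_pos = (beams[0][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
        if len(mask_pos) == 0:
            break
        pos = mask_pos[0].item()

        candidates = []
        for seq, score in beams:
            with torch.no_grad():
                logits = model(input_ids=seq.unsqueeze(0)).logits[0, pos]
            log_probs = logits.log_softmax(dim=-1)
            top = log_probs.topk(top_k)
            for token_id, lp in zip(top.indices.tolist(), top.values.tolist()):
                new_seq = seq.clone()
                new_seq[pos] = token_id
                candidates.append((new_seq, score + lp))

        # Beam-search step: keep only the highest-scoring partial fills
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]

    return [(tokenizer.decode(seq.tolist(), skip_special_tokens=True), score) for seq, score in beams]

for text, score in fill_masks_left_to_right("I am going to <mask> <mask> in this sentence"):
    print(round(score, 2), text)

Scoring each candidate by its cumulative log-probability keeps the final ranking comparable across sequences, which is what the beam search at the end relies on.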
Please see issue https://github.com/huggingface/transformers/issues/10158 and PR https://github.com/huggingface/transformers/pull/10222 for an attempt at addressing this.