
Using the T5 model with huggingface's mask-fill pipeline

See original GitHub issue

Does anyone know if it is possible to use the T5 model with Hugging Face’s fill-mask pipeline? The snippet below shows how to do it with the default model, but I can’t seem to figure out how to do it with the T5 model specifically.

from transformers import pipeline
nlp_fill = pipeline('fill-mask')
nlp_fill('Hugging Face is a French company based in ' + nlp_fill.tokenizer.mask_token)

Trying this with T5, for example, raises the error “TypeError: must be str, not NoneType”, because nlp_fill.tokenizer.mask_token is None:

nlp_fill = pipeline('fill-mask',model="t5-base", tokenizer="t5-base")
nlp_fill('Hugging Face is a French company based in ' + nlp_fill.tokenizer.mask_token)
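
T5 was pre-trained with span-corruption sentinel tokens (<extra_id_0> through <extra_id_99>) rather than a BERT-style [MASK] token, which is why the tokenizer’s mask_token is None and the fill-mask pipeline has nothing to substitute. A quick check (a minimal sketch, assuming the standard transformers T5Tokenizer API; t5_tok is just a local name):

from transformers import T5Tokenizer

# T5's tokenizer defines no mask token, only sentinel tokens such as <extra_id_0>
t5_tok = T5Tokenizer.from_pretrained('t5-base')
print(t5_tok.mask_token)                             # None (some versions also log a warning)
print(t5_tok.convert_tokens_to_ids('<extra_id_0>'))  # 32099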

Stack Overflow question

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 18 (8 by maintainers)

Top GitHub Comments

31 reactions
girishponkiya commented, May 2, 2020

Could we use the following workaround?

  • <extra_id_0> can be treated as the mask token
  • Candidate sequences for the masked span can then be generated with code like the following:
import torch
from transformers import T5Tokenizer, T5Config, T5ForConditionalGeneration

T5_PATH = 't5-base' # "t5-small", "t5-base", "t5-large", "t5-3b", "t5-11b"

DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu') # my environment uses CPU

t5_tokenizer = T5Tokenizer.from_pretrained(T5_PATH)
t5_config = T5Config.from_pretrained(T5_PATH)
t5_mlm = T5ForConditionalGeneration.from_pretrained(T5_PATH, config=t5_config).to(DEVICE)

# Input text
text = 'India is a <extra_id_0> of the world. </s>'

encoded = t5_tokenizer.encode_plus(text, add_special_tokens=True, return_tensors='pt')
input_ids = encoded['input_ids'].to(DEVICE)

# Generating 20 candidate sequences with maximum length set to 5
outputs = t5_mlm.generate(input_ids=input_ids, 
                          num_beams=200, num_return_sequences=20,
                          max_length=5)

_0_index = text.index('<extra_id_0>')
_result_prefix = text[:_0_index]
_result_suffix = text[_0_index+12:]  # 12 is the length of <extra_id_0>

def _filter(output, end_token='<extra_id_1>'):
    # The first token is the decoder start token <pad> (id 0) and the second is <extra_id_0> (id 32099)
    _txt = t5_tokenizer.decode(output[2:], skip_special_tokens=False, clean_up_tokenization_spaces=False)
    if end_token in _txt:
        _end_token_index = _txt.index(end_token)
        return _result_prefix + _txt[:_end_token_index] + _result_suffix
    else:
        return _result_prefix + _txt + _result_suffix

results = list(map(_filter, outputs))
results

Output:

['India is a cornerstone of the world. </s>',
 'India is a part of the world. </s>',
 'India is a huge part of the world. </s>',
 'India is a big part of the world. </s>',
 'India is a beautiful part of the world. </s>',
 'India is a very important part of the world. </s>',
 'India is a part of the world. </s>',
 'India is a unique part of the world. </s>',
 'India is a part of the world. </s>',
 'India is a part of the world. </s>',
 'India is a beautiful country in of the world. </s>',
 'India is a part of the of the world. </s>',
 'India is a small part of the world. </s>',
 'India is a part of the world. </s>',
 'India is a part of the world. </s>',
 'India is a country in the of the world. </s>',
 'India is a large part of the world. </s>',
 'India is a part of the world. </s>',
 'India is a significant part of the world. </s>',
 'India is a part of the world. </s>']
3 reactions
klimentij commented, May 13, 2020

@girishponkiya Thanks for your example! Unfortunately, I can’t reproduce your results. I get

['India is a _0> of the world. </s>',
 'India is a  ⁇ extra of the world. </s>',
 'India is a India is  of the world. </s>',
 'India is a  ⁇ extra_ of the world. </s>',
 'India is a a  of the world. </s>',
 'India is a [extra_ of the world. </s>',
 'India is a India is an of the world. </s>',
 'India is a of the world of the world. </s>',
 'India is a India. of the world. </s>',
 'India is a is a of the world. </s>',
 'India is a India  ⁇  of the world. </s>',
 'India is a Inde is  of the world. </s>',
 'India is a ] of the of the world. </s>',
 'India is a . of the world. </s>',
 'India is a _0 of the world. </s>',
 'India is a is  ⁇  of the world. </s>',
 'India is a india is  of the world. </s>',
 'India is a India is the of the world. </s>',
 'India is a -0> of the world. </s>',
 'India is a  ⁇ _ of the world. </s>']

Tried on CPU, GPU, ‘t5-base’ and ‘t5-3b’ — same thing.
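
One possible cause of garbled output like this is that the position of the sentinel tokens in the generated ids differs across transformers versions, so the hard-coded output[2:] slice cuts the sequence in the wrong place. A more defensive variant is to look up the sentinel token ids and decode only the span between them. This is a sketch, not a tested fix: it reuses t5_tokenizer, t5_mlm and DEVICE from the snippet above, and the helper name fill_sentinel is made up for illustration.

def fill_sentinel(text, tokenizer, model, device, num_beams=50, max_length=8):
    # Locate the sentinel ids instead of slicing the output at fixed positions
    start_id = tokenizer.convert_tokens_to_ids('<extra_id_0>')
    end_id = tokenizer.convert_tokens_to_ids('<extra_id_1>')
    input_ids = tokenizer.encode(text, return_tensors='pt').to(device)
    output = model.generate(input_ids=input_ids, num_beams=num_beams, max_length=max_length)[0].tolist()
    # Keep only the ids strictly between <extra_id_0> and <extra_id_1>
    if start_id in output:
        output = output[output.index(start_id) + 1:]
    if end_id in output:
        output = output[:output.index(end_id)]
    filled = tokenizer.decode(output, skip_special_tokens=True).strip()
    return text.replace('<extra_id_0>', filled)

fill_sentinel('India is a <extra_id_0> of the world.', t5_tokenizer, t5_mlm, DEVICE)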


Top Results From Across the Web

T5 - Hugging Face
T5 is an encoder-decoder model and converts all NLP problems into a text-to-text format. It is trained using teacher forcing. This means that...

Using the T5 model with huggingface's mask-fill pipeline
Does anyone know if it is possible to use the T5 model with hugging face's mask-fill pipeline? The below is how you can...

Deploy T5 11B for inference for less than $500 - philschmid
This blog will teach you how to deploy T5 11B for inference using Hugging Face Inference Endpoints. The T5 model was presented in...

Fine-Tuning T5 for Question Answering using HuggingFace ...
Prepare for the Machine Learning interview: https://mlexpert.io Subscribe: http://bit.ly/venelin-subscribe Get SH*T Done with PyTorch ...

Abstractive Summarization with Hugging Face Transformers
If you are using one of the five T5 checkpoints we have to prefix the inputs with "summarize:" (the model can also translate...
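
As an illustration of the task-prefix convention mentioned in the last result, here is a minimal summarization sketch; it assumes only the standard transformers generate API, and the article text is made up:

from transformers import T5Tokenizer, T5ForConditionalGeneration

# The original T5 checkpoints select the task through a text prefix such as "summarize:"
tokenizer = T5Tokenizer.from_pretrained('t5-small')
model = T5ForConditionalGeneration.from_pretrained('t5-small')

article = ("Hugging Face maintains the Transformers library, which provides "
           "pretrained models for many natural language processing tasks.")
input_ids = tokenizer.encode('summarize: ' + article, return_tensors='pt')
summary_ids = model.generate(input_ids, num_beams=4, max_length=30, early_stopping=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))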
