T5 model seq2seq text generation using word embeddings instead of token_ids does not work
Hi there,
I trained an MT5ForConditionalGeneration model. During training, I used my own embeddings on the encoder side (but the default embeddings for decoding). However, when I try to produce output with the generate function, I get an error. I will post the code and the error message below:
Here is the code for model training:
outputs = self.encoder2(inputs_embeds=context, attention_mask=input_mask, labels=padded_labels)
Here, context plays the same role as a batch of token_ids, except that it contains embeddings, and padded_labels holds the target-sequence token_ids. Training works fine without any issues.
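For context, here is a self-contained sketch of what such a training step can look like. The model name (google/mt5-small), the random context tensor, and the example labels are placeholders I am assuming for illustration, not the original setup:

```python
import torch
from transformers import MT5ForConditionalGeneration, MT5Tokenizer

# Illustrative sketch only; in the real code, `context` comes from custom embeddings.
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")
tokenizer = MT5Tokenizer.from_pretrained("google/mt5-small")

batch_size, src_len, hidden = 2, 16, model.config.d_model
context = torch.randn(batch_size, src_len, hidden)           # stand-in for custom encoder embeddings
input_mask = torch.ones(batch_size, src_len, dtype=torch.long)
padded_labels = tokenizer(["an example", "another one"],
                          padding=True, return_tensors="pt").input_ids

# Teacher-forced training step: the labels are shifted internally to form decoder inputs.
outputs = model(inputs_embeds=context, attention_mask=input_mask, labels=padded_labels)
outputs.loss.backward()
```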
And here is the line where I try to generate with the model:
outputs = self.encoder2.generate(input_ids=None, inputs_embeds=context, attention_mask=input_mask, bos_token_id=0, pad_token_id=0, eos_token_id=1)
Once the program hits the above line, I get the following error message:
outputs = self.encoder2.generate(input_ids=None, inputs_embeds=context, attention_mask=input_mask, bos_token_id=0, pad_token_id=0, eos_token_id=1)
  File "/scratch/jerryc/jerryc/venv_py3.7/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/scratch/jerryc/jerryc/venv_py3.7/lib/python3.7/site-packages/transformers/generation_utils.py", line 913, in generate
    input_ids, decoder_start_token_id=decoder_start_token_id, bos_token_id=bos_token_id
  File "/scratch/jerryc/jerryc/venv_py3.7/lib/python3.7/site-packages/transformers/generation_utils.py", line 422, in _prepare_decoder_input_ids_for_generation
    torch.ones((input_ids.shape[0], 1), dtype=torch.long, device=input_ids.device) * decoder_start_token_id
AttributeError: 'NoneType' object has no attribute 'shape'
It seems the model is not handling this case properly: judging from the traceback, _prepare_decoder_input_ids_for_generation reads input_ids.shape to build the decoder start tokens, which fails when only inputs_embeds is provided and input_ids is None. Any help would be appreciated. Thanks.
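Until generate handles this case itself, one possible workaround is to supply explicit decoder_input_ids so that generate never has to infer the batch size from input_ids. This is only a sketch that reuses the variables from the snippets above; whether your installed transformers release accepts decoder_input_ids alongside inputs_embeds here is an assumption you would need to verify against that version:

```python
import torch

# Hedged workaround sketch, not the official fix.
batch_size = context.shape[0]
decoder_start = self.encoder2.config.decoder_start_token_id  # 0 (the pad token) for mT5
decoder_input_ids = torch.full(
    (batch_size, 1), decoder_start, dtype=torch.long, device=context.device
)

outputs = self.encoder2.generate(
    input_ids=None,
    inputs_embeds=context,
    attention_mask=input_mask,
    decoder_input_ids=decoder_input_ids,  # supplies the batch size that generate would
                                          # otherwise try to read from input_ids.shape
    eos_token_id=1,
    pad_token_id=0,
)
```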
Top GitHub Comments
@ichiroex,
Thanks for the nicely reproducible code snippet - this is indeed a bug and should be fixed.
@patrickvonplaten Thank you!!