
T5 model seq2seq text generation using word embeddings instead of token_ids does not work

See original GitHub issue

Hi there,

I trained an MT5ForConditionalGeneration model. During training I used my own embeddings on the encoder side (but the default embeddings for decoding). However, when I try to generate output with the generate function, I get an error. The code and the error message follow.

Here is the model training call:

```python
outputs = self.encoder2(inputs_embeds=context, attention_mask=input_mask, labels=padded_labels)
```

Here `context` plays the role of a batch of token_ids, except that it contains embeddings, and `padded_labels` holds the target-sequence token_ids. Training works fine without any issues.
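For completeness, here is a self-contained sketch of that training setup (simplified from my real code; the embedding construction below is only a stand-in for my own embedding layer, and the model/checkpoint names are illustrative):

```python
from transformers import AutoTokenizer, MT5ForConditionalGeneration

model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")
tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")

src = tokenizer(["translate English to German: hello"], return_tensors="pt", padding=True)
tgt = tokenizer(["hallo"], return_tensors="pt", padding=True)

# In my real code `context` comes from my own embedding layer; the model's
# default input embeddings are used here only as a stand-in.
context = model.get_input_embeddings()(src.input_ids)  # (batch, seq_len, d_model)

outputs = model(
    inputs_embeds=context,
    attention_mask=src.attention_mask,
    labels=tgt.input_ids,  # target token ids; the loss is computed internally
)
loss = outputs.loss  # this forward/backward path works without issues
```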

And here is the line where I try to generate with the model:

```python
outputs = self.encoder2.generate(input_ids=None, inputs_embeds=context, attention_mask=input_mask, bos_token_id=0, pad_token_id=0, eos_token_id=1)
```

Once the program hits the line above, I get the following error message:

```
outputs = self.encoder2.generate(input_ids=None, inputs_embeds=context, attention_mask=input_mask, bos_token_id=0, pad_token_id=0, eos_token_id=1)
  File "/scratch/jerryc/jerryc/venv_py3.7/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/scratch/jerryc/jerryc/venv_py3.7/lib/python3.7/site-packages/transformers/generation_utils.py", line 913, in generate
    input_ids, decoder_start_token_id=decoder_start_token_id, bos_token_id=bos_token_id
  File "/scratch/jerryc/jerryc/venv_py3.7/lib/python3.7/site-packages/transformers/generation_utils.py", line 422, in _prepare_decoder_input_ids_for_generation
    torch.ones((input_ids.shape[0], 1), dtype=torch.long, device=input_ids.device) * decoder_start_token_id
AttributeError: 'NoneType' object has no attribute 'shape'
```

It seems the model is not handling this case properly. Any help would be appreciated. Thanks!
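(Edit: until a fix lands, one workaround that seems to sidestep the crash, based on my reading of generation_utils.py rather than any official guidance, is to build decoder_input_ids by hand so generate() never needs input_ids to create them:)

```python
import torch

# mT5 starts decoding from decoder_start_token_id (the pad token, id 0);
# supplying decoder_input_ids ourselves avoids the branch that dereferences
# the (None) input_ids.
decoder_input_ids = torch.full(
    (context.shape[0], 1),
    self.encoder2.config.decoder_start_token_id,
    dtype=torch.long,
    device=context.device,
)

outputs = self.encoder2.generate(
    inputs_embeds=context,
    attention_mask=input_mask,
    decoder_input_ids=decoder_input_ids,
    bos_token_id=0,
    pad_token_id=0,
    eos_token_id=1,
)
```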

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments:9 (4 by maintainers)

Top GitHub Comments

1 reaction
patrickvonplaten commented, Nov 18, 2021

@ichiroex,

Thanks for the nicely reproducible code snippet - this is indeed a bug and should be fixed.
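(For readers landing here later: assuming a transformers release where that fix has shipped, passing inputs_embeds to generate() on an encoder-decoder model should work without workarounds; a minimal sanity check, with the checkpoint name below purely illustrative:)

```python
from transformers import AutoTokenizer, MT5ForConditionalGeneration

model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")
tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")

enc = tokenizer("hello world", return_tensors="pt")
embeds = model.get_input_embeddings()(enc.input_ids)

# On fixed versions, generate() derives decoder_input_ids from
# decoder_start_token_id itself, so inputs_embeds alone suffices.
out = model.generate(inputs_embeds=embeds, attention_mask=enc.attention_mask)
print(tokenizer.batch_decode(out, skip_special_tokens=True))
```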

0 reactions
ichiroex commented, Nov 19, 2021

@patrickvonplaten Thank you!!

Read more comments on GitHub >

Top Results From Across the Web

T5 - Hugging Face
T5 is an encoder-decoder model and converts all NLP problems into a text-to-text format. It is trained using teacher forcing. This means that...
Read more >
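To make the teacher-forcing point concrete, a minimal sketch (not taken from the linked docs): when labels are passed, the model right-shifts them internally to build the decoder inputs.

```python
from transformers import T5ForConditionalGeneration, T5TokenizerFast

model = T5ForConditionalGeneration.from_pretrained("t5-small")
tokenizer = T5TokenizerFast.from_pretrained("t5-small")

inputs = tokenizer("translate English to German: Hello.", return_tensors="pt")
labels = tokenizer("Hallo.", return_tensors="pt").input_ids

# Teacher forcing: `labels` are shifted right internally to form
# decoder_input_ids, so the decoder conditions on the gold prefix.
loss = model(**inputs, labels=labels).loss
```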
Word embeddings with Google's T5? - nlp - Stack Overflow
Yes, that is possible. Just feed the ids of the words to the word embedding layer: from transformers import T5TokenizerFast, ...
Read more >
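The code in that answer is truncated above; a complete version of the same idea might look like this (reconstructed, so details may differ from the actual answer):

```python
from transformers import T5TokenizerFast, T5EncoderModel

tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5EncoderModel.from_pretrained("t5-small")

ids = tokenizer("hello world", return_tensors="pt").input_ids

# Static (context-free) embeddings: index the shared embedding matrix.
static = model.get_input_embeddings()(ids)            # (1, seq_len, 512)

# Contextualized embeddings: run the full encoder stack.
contextual = model(input_ids=ids).last_hidden_state   # (1, seq_len, 512)
```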
Neural machine translation with a Transformer and Keras | Text
The Transformer starts by generating initial representations, or embeddings, for each word... Then, using self-attention, it aggregates information from all of ...
Read more >
Abstractive Summarization with Hugging Face Transformers
Following prior work, we aim to tackle this problem using a sequence-to-sequence model. Text-to-Text Transfer Transformer (T5) is a ...
Read more >
Reconstructing Text from Contextualized Word Embeddings ...
word embeddings produced by transformer- ... from the year 1800 to 1920 due to technical issues ... language models instead, since the search...
Read more >
