
MT5ForConditionalGeneration has model.config.max_length=20 by default. Why?

See original GitHub issue
  • transformers version: 4.6.1
  • Platform: Ubuntu 18
  • Python version: 3.6

I spent a week training a T5 model with this package and couldn't figure out why the sequences produced by Trainer.evaluate were capped at a maximum of 20 tokens. I had set the max_length argument when encoding the input/output with the tokenizer, so the truncation was not coming from there. After a long time I found out that this happens:

model = MT5ForConditionalGeneration.from_pretrained('google/mt5-small')
model.config.max_length
Out: 20

The generate method was being used inside Trainer because I had set predict_with_generate=True. Please change this behaviour; this was a very hard bug to find. model.config.max_length should default to None when the model itself has no length limitation.
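For reference, the 20-token default does not come from mT5 itself but from the base PretrainedConfig, so it shows up even on a freshly constructed config with no checkpoint involved. A minimal sketch (no model download needed; this assumes a transformers version where config.max_length is still populated, as it is in the 4.x series discussed here):

```python
from transformers import MT5Config

# The base PretrainedConfig sets max_length=20, so every model config
# inherits it unless the checkpoint or the user overrides it.
config = MT5Config()
print(config.max_length)  # 20

# Raising the config value changes what generate() (and therefore
# Trainer.evaluate with predict_with_generate=True) falls back to
# when no max_length is passed explicitly.
config.max_length = 256
print(config.max_length)  # 256
```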

Issue Analytics

  • State: closed
  • Created: a year ago
  • Reactions: 3
  • Comments: 9 (8 by maintainers)

Top GitHub Comments

2 reactions
patrickvonplaten commented, Apr 12, 2022

Sadly we cannot change this default anymore for backward-compatibility reasons. Always letting the model generate up to the maximum allowed number of tokens can also be tricky: multiple models would error out due to memory, and some models like T5 have no real max length, so I think we'll have to leave it at 20. Maybe we can improve the docs somehow.

1 reaction
patrickvonplaten commented, Apr 13, 2022

People who are familiar with generate() should know that max_length can and should be overwritten. I'll try to make the docs better here, but I don't think we should add a warning, as it would literally be shown every time someone calls generate without defining max_length.
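As a concrete illustration of the override the maintainer describes, the sketch below passes max_length directly to generate(), which takes precedence over model.config.max_length. It builds a tiny randomly initialised mT5 only so it runs without downloading a checkpoint; real code would use from_pretrained('google/mt5-small'), and the sizes here are hypothetical:

```python
import torch
from transformers import MT5Config, MT5ForConditionalGeneration

# Tiny random-weight model: the dimensions are made up, chosen only so
# this example runs quickly without fetching pretrained weights.
config = MT5Config(vocab_size=100, d_model=16, d_ff=32, d_kv=4,
                   num_layers=2, num_heads=2)
model = MT5ForConditionalGeneration(config)

input_ids = torch.randint(0, 100, (1, 8))

# An explicit max_length overrides the config's 20-token default,
# so the output can exceed 20 tokens (up to 40 here).
output = model.generate(input_ids, max_length=40)
print(output.shape[1] <= 40)  # True
```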


Top Results From Across the Web

  • mT5 - Hugging Face
    It is used to instantiate a mT5 model according to the specified arguments, defining the model architecture. Instantiating a configuration with the defaults...
  • PyTorch-Transformers
    The configuration is optional. The configuration object holds information concerning the model, such as the number of heads/layers, if the model should output ...
  • Asking to truncate to max_length but no maximum length is ...
    Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation...
