The LongT5 documentation conflicts with its example code about the task prefix
See original GitHub issue

System Info
All.
Who can help?
Reproduction
See https://huggingface.co/docs/transformers/main/en/model_doc/longt5
Expected behavior
In the document above, it says: "Unlike the T5 model, LongT5 does not use a task prefix. Furthermore, it uses a different pre-training objective inspired by the pre-training of [PegasusForConditionalGeneration]." But the example code for LongT5ForConditionalGeneration prepends a summarize: prefix. I am confused about how to use LongT5 for different downstream tasks. Could you please help? Thanks.
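For reference, the documented example that prompted the question looks roughly like the following sketch (the checkpoint name and input text are illustrative, not copied verbatim from the docs; the point is the summarize: prefix):

```python
# Rough sketch of the LongT5ForConditionalGeneration example in question:
# note the "summarize: " prefix, which the surrounding prose says LongT5
# does not need.
from transformers import AutoTokenizer, LongT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/long-t5-local-base")
model = LongT5ForConditionalGeneration.from_pretrained("google/long-t5-local-base")

inputs = tokenizer(
    "summarize: studies have shown that owning a dog is good for you",
    return_tensors="pt",
)
output_ids = model.generate(inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```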
Issue Analytics
- State: closed
- Created a year ago
- Comments: 7 (3 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Hey @GabrielLin,

That depends on how different the use cases are and what your limitations are exactly. In general, I'd say yes, you should use different fine-tuned models for different tasks.
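In practice, that means loading a checkpoint fine-tuned for the task at hand and passing the raw input with no prefix. A minimal sketch, assuming the public PubMed summarization checkpoint referenced in the LongT5 docs (swap in your own fine-tuned model for other tasks):

```python
# Minimal sketch: a task-specific fine-tuned LongT5 checkpoint, no task prefix.
# Stancld/longt5-tglobal-large-16384-pubmed-3k_steps is a public checkpoint
# fine-tuned for PubMed summarization.
from transformers import AutoTokenizer, LongT5ForConditionalGeneration

ckpt = "Stancld/longt5-tglobal-large-16384-pubmed-3k_steps"
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = LongT5ForConditionalGeneration.from_pretrained(ckpt)

# The raw document goes in directly -- no "summarize: " prefix.
inputs = tokenizer("A long biomedical article to summarize ...", return_tensors="pt")
summary_ids = model.generate(inputs.input_ids, max_length=128)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```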
@patrickvonplaten Got it. Thanks. This issue has been fixed and closed.