Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging third-party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

`LongT5`: Efficient Text-To-Text Transformer for Long Sequences

See original GitHub issue

🌟 New model addition – LongT5: Efficient Text-To-Text Transformer for Long Sequences

Model description

LongT5 is an extension of the T5 model that handles long sequence inputs more efficiently. We integrated attention ideas from long-input transformers (ETC), and adopted pre-training strategies from summarization pre-training (PEGASUS) into the scalable T5 architecture. The result is a new attention mechanism we call Transient Global (TGlobal), which mimics ETC's local/global attention mechanism, but without requiring additional side-inputs. We are able to achieve state-of-the-art results on several summarization and question answering tasks, as well as outperform the original T5 models on these tasks.

Description copied from https://github.com/google-research/longt5/blob/master/README.md.

The full paper is currently available on arXiv – LongT5: Efficient Text-To-Text Transformer for Long Sequences.
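To make the TGlobal idea above concrete, here is a minimal sketch of the attention pattern it describes: each token attends to a local window of neighbors plus a small set of "transient" global tokens, one per fixed-size block of the input. This is an illustrative sketch under assumed parameter names (`r` for the local radius, `k` for the block size), not the authors' implementation.

```python
import numpy as np

def tglobal_attention_mask(seq_len, r=2, k=4):
    """Boolean mask over (query, key) pairs for TGlobal-style attention.

    Keys consist of the seq_len input tokens followed by one transient
    global token per block of k inputs. Each query attends to its local
    window of radius r plus every global token. Parameter names r and k
    are assumptions for this sketch.
    """
    n_blocks = (seq_len + k - 1) // k  # one global token per block
    mask = np.zeros((seq_len, seq_len + n_blocks), dtype=bool)
    for i in range(seq_len):
        lo, hi = max(0, i - r), min(seq_len, i + r + 1)
        mask[i, lo:hi] = True       # local window around token i
        mask[i, seq_len:] = True    # all block-summary (global) tokens
    return mask

mask = tglobal_attention_mask(8, r=1, k=4)
```

Because the global tokens are computed on the fly from blocks of the input rather than supplied separately, no extra side-inputs are needed, and the per-token cost stays roughly linear in sequence length (local window plus seq_len/k globals) rather than quadratic.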

Open source status

The model has its own repository available here.

  • the model implementation is available in the Google FlaxFormer repo.
  • the model weights are available: currently, Google has released five checkpoints listed in the LongT5 repo:
  • LongT5-Local-Base (250 million parameters)
  • LongT5-TGlobal-Base (250 million parameters)
  • LongT5-Local-Large (780 million parameters)
  • LongT5-TGlobal-Large (780 million parameters)
  • LongT5-TGlobal-XL (3 billion parameters)

Additional context

If none of the original authors are interested in porting the model into transformers, I’ll be more than happy to work on this :].

Issue Analytics

  • State: closed
  • Created a year ago
  • Reactions: 11
  • Comments: 8 (8 by maintainers)

Top GitHub Comments

2 reactions
stancld commented, Apr 13, 2022

@patrickvonplaten @patil-suraj I’m gonna give it a try and will try to open a draft PR as soon as I have some progress! :]

Also @patrickvonplaten, thanks a lot for all the useful links you have posted here! :]

0 reactions
patil-suraj commented, Apr 11, 2022

This is super cool! Happy to help if anyone wants to give it a try 😃

Read more comments on GitHub >

Top Results From Across the Web

LongT5: Efficient Text-To-Text Transformer for Long Sequences
In this paper, we present a new model, called LongT5, with which we explore the effects of scaling both the input length and...

LongT5: Efficient Text-To-Text Transformer for Long Sequences
In this paper, we present LongT5, a new model that explores the effects of scaling both the input length and model size at...

google-research/longt5 - GitHub
LongT5: Efficient Text-To-Text Transformer for Long Sequences. LongT5 is an extension of the T5 model that handles long sequence inputs more efficiently....

LongT5 - Hugging Face
The LongT5 model was proposed in LongT5: Efficient Text-To-Text Transformer for Long Sequences by Mandy Guo, Joshua Ainslie, David Uthus, Santiago Ontanon, ...

LongT5: Efficient Text-To-Text Transformer for ... - OpenReview
In this paper, we present LongT5, a new model that explores the effects of scaling both the input length and model size at...
