ONNX T5 for Generation
Environment info
- adapter-transformers version: 2.1.2
- Platform: Windows-10-10.0.19041-SP0
- Python version: 3.7.5
- PyTorch version (GPU?): 1.8.1+cpu (False)
- Tensorflow version (GPU?): 2.3.0 (False)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
Who can help
@patrickvonplaten, @patil-suraj
Information
I want to use the T5 model converted to ONNX for generation, but I can only pass decoder_input_ids with a sequence length of 1.
To reproduce
Steps to reproduce the behavior:
- Convert the T5 model to ONNX:
python -m transformers.onnx --model=t5-base --feature=seq2seq-lm onnx/t5-base/
- Load the ONNX model with onnxruntime (the snippet after this list shows how to inspect which input and output names the exported graph expects):
import onnxruntime
session = onnxruntime.InferenceSession('onnx/t5-base/model.onnx')
- Pass the model an input whose decoder sequence contains more than one token:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-base")
encoder_input = tokenizer("This is some text.", return_tensors="np")
# Any decoder sequence longer than one token triggers the failure:
decoder_inputs = tokenizer("bla bla", return_tensors="np")
print(decoder_inputs)
model_input = {
    "input_ids": encoder_input["input_ids"],
    "attention_mask": encoder_input["attention_mask"],
    "decoder_input_ids": decoder_inputs["input_ids"],
    "decoder_attention_mask": decoder_inputs["attention_mask"],
}
outputs = session.run([], model_input)
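For reference, the input and output names the exported graph actually expects can be listed straight from the session; this uses only standard onnxruntime APIs:

# Print every input and output the ONNX graph declares,
# together with its (possibly symbolic) shape.
for inp in session.get_inputs():
    print("input:", inp.name, inp.shape)
for out in session.get_outputs():
    print("output:", out.name, out.shape)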
Expected behavior
I would expect there to be a way to pass multiple decoder_input_ids to the model to generate text. How is this intended to be done?
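Without a key/value cache, the usual approach is to re-run the full model at every step, feeding back everything generated so far and reading only the logits for the last position. A minimal greedy-decoding sketch along those lines (it assumes the export's first output is the logits tensor; T5's decoder start token is its pad token, id 0, and its EOS token is id 1):

import numpy as np
import onnxruntime
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-base")
session = onnxruntime.InferenceSession("onnx/t5-base/model.onnx")

encoder_input = tokenizer("This is some text.", return_tensors="np")

# T5 starts decoding from the pad token (decoder_start_token_id == pad_token_id == 0).
decoder_input_ids = np.array([[tokenizer.pad_token_id]], dtype=np.int64)

for _ in range(64):  # maximum number of generated tokens
    outputs = session.run(None, {
        "input_ids": encoder_input["input_ids"],
        "attention_mask": encoder_input["attention_mask"],
        "decoder_input_ids": decoder_input_ids,
        "decoder_attention_mask": np.ones_like(decoder_input_ids),
    })
    logits = outputs[0]  # assumed: the first output of the export is the lm-head logits
    # Greedy pick at the last decoder position, appended to the running sequence.
    next_token = logits[:, -1].argmax(-1).reshape(1, 1).astype(np.int64)
    decoder_input_ids = np.concatenate([decoder_input_ids, next_token], axis=-1)
    if next_token.item() == tokenizer.eos_token_id:  # </s> is id 1 for T5
        break

print(tokenizer.decode(decoder_input_ids[0], skip_special_tokens=True))

This is quadratic in the output length, since each step re-encodes the whole decoder prefix; exporting with a key/value cache avoids that.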
I see. What would I use as past key values for the first token that is generated?
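For the very first step there is no cache yet, so a common pattern is to run one step without past key values (e.g. with the plain seq2seq-lm export) and then feed the returned present tensors back in as the past for every later single-token step. A rough sketch of that pattern, continuing the variables from the sketch above; the past_session path, the past_key_values.* input names, and the present.* output ordering are all assumptions about a hypothetical with-past export and should be checked against get_inputs()/get_outputs():

import numpy as np
import onnxruntime

# Hypothetical second export whose decoder accepts a key/value cache (path assumed).
past_session = onnxruntime.InferenceSession("onnx/t5-base/model_with_past.onnx")

# Step 1: no cache exists yet, so run the plain (no-past) model once.
outputs = session.run(None, {
    "input_ids": encoder_input["input_ids"],
    "attention_mask": encoder_input["attention_mask"],
    "decoder_input_ids": decoder_input_ids,          # just the start token, shape (1, 1)
    "decoder_attention_mask": np.ones_like(decoder_input_ids),
})
logits, presents = outputs[0], outputs[1:]           # assumed: logits first, cache tensors after

next_token = logits[:, -1].argmax(-1).reshape(1, 1).astype(np.int64)

# Later steps: feed only the newest token plus the cached key/values.
past_names = [i.name for i in past_session.get_inputs() if i.name.startswith("past_key_values")]
feed = {
    "input_ids": encoder_input["input_ids"],
    "attention_mask": encoder_input["attention_mask"],
    "decoder_input_ids": next_token,                 # shape (1, 1): only the new token
}
feed.update(dict(zip(past_names, presents)))         # map present.* outputs to past_key_values.* inputs
step_outputs = past_session.run(None, feed)
logits, presents = step_outputs[0], step_outputs[1:]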
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.