
Feature request: Add built-in support for autoregressive text generation with ONNX models


🚀 Add built-in support for autoregressive text generation with ONNX models.

After converting an autoregressive model to ONNX, it would be nice to be able to generate text with it via something like:

from transformers import OnnxTextGenerationModel, AutoTokenizer

model_path = "gpt-something.onnx"
tokenizer_name = "gpt2"

model = OnnxTextGenerationModel(model_path)
tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)

# and then

text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors="np")  # numpy tensors suit ONNX Runtime
output = model.generate(encoded_input)

Ideally with support for reusing past_key_values internally, so that each decoding step does not recompute attention over the whole prefix.
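
For illustration only, here is a rough sketch of what reusing past_key_values means in a hand-rolled greedy decoding loop over an ONNX export that exposes its cache. The input/output names and their ordering are assumptions that depend entirely on how the model was exported, and attention_mask/position_ids handling is omitted for brevity:

import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("gpt-something.onnx")
# Assumption: every input other than input_ids is a cache input, and the
# session's outputs come back as [logits, *cache] in matching order.
cache_input_names = [i.name for i in session.get_inputs() if i.name != "input_ids"]

def greedy_decode(input_ids, max_new_tokens=20):
    past = None
    for _ in range(max_new_tokens):
        if past is None:
            # Prefill: run the whole prompt. (Some exports require feeding
            # zero-length past tensors even on this first step.)
            feeds = {"input_ids": input_ids}
        else:
            # With a cache, only the newest token is fed; the graph reads
            # the previous keys/values instead of recomputing them.
            feeds = {"input_ids": input_ids[:, -1:], **past}
        outputs = session.run(None, feeds)
        logits, present = outputs[0], outputs[1:]
        next_token = logits[:, -1, :].argmax(axis=-1)[:, None]
        input_ids = np.concatenate([input_ids, next_token], axis=-1)
        # Feed this step's cache outputs back in as the next step's cache inputs.
        past = dict(zip(cache_input_names, present))
    return input_ids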

Motivation

When trying to accelerate inference with transformers, it is frustrating that we cannot load our ONNX model with the library and call model.generate to seamlessly generate sequences and perform beam search. This forces us to rely on custom implementations, which take time to write and are far more prone to bugs.

We can try to hack a subclass of GenerationMixin (a rough sketch follows), but having to convert tensors to and from PyTorch at every decoding step makes everything too slow.
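
To make that concrete, here is a minimal sketch of such a hack. It is illustrative, not a drop-in implementation: the exact hooks generate() expects vary across transformers versions, and the ONNX output layout is an assumption. The expensive part is the torch → numpy → torch round trip on every decoding step:

import torch
import onnxruntime as ort
from transformers import AutoConfig, GenerationMixin
from transformers.modeling_outputs import CausalLMOutput

class OnnxCausalLM(GenerationMixin):
    main_input_name = "input_ids"

    def __init__(self, onnx_path, config_name="gpt2"):
        self.session = ort.InferenceSession(onnx_path)
        # generate() consults the config for eos/pad token ids and the like.
        self.config = AutoConfig.from_pretrained(config_name)
        self.device = torch.device("cpu")

    def prepare_inputs_for_generation(self, input_ids, **kwargs):
        return {"input_ids": input_ids, "attention_mask": kwargs.get("attention_mask")}

    def __call__(self, input_ids, attention_mask=None, **kwargs):
        # The slow part: torch -> numpy before the session call and
        # numpy -> torch after it, repeated at every single decoding step.
        feeds = {"input_ids": input_ids.cpu().numpy()}
        if attention_mask is not None:
            feeds["attention_mask"] = attention_mask.cpu().numpy()
        logits = self.session.run(None, feeds)[0]  # assumes logits is the first output
        return CausalLMOutput(logits=torch.from_numpy(logits))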

Your contribution

I can try submitting a PR, but it will take a while: I work full-time and may not have enough spare time to move quickly.

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 8 (8 by maintainers)

Top GitHub Comments

4 reactions
michaelbenayoun commented, Dec 1, 2021

Yes, this is planned. Nice to know that there is interest in such features!

Pinging @lewisbails and @philschmid, as they were the ones who suggested adding this kind of feature to optimum.

1 reaction
piEsposito commented, Dec 9, 2021

We are following this discussion on https://github.com/huggingface/optimum/issues/55.
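
Note: the feature discussed here has since landed in optimum as ORTModelForCausalLM, which puts an ONNX Runtime session behind the usual generate API. A usage sketch follows; the argument that triggers the ONNX export has been renamed across optimum versions (older releases used from_transformers=True instead of export=True):

from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer

# Export gpt2 to ONNX on the fly and wrap it for generation.
model = ORTModelForCausalLM.from_pretrained("gpt2", export=True)
tokenizer = AutoTokenizer.from_pretrained("gpt2")

inputs = tokenizer("Replace me by any text you'd like.", return_tensors="pt")
output_ids = model.generate(**inputs)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))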
