Feature request: Add built-in support for autoregressive text generation with ONNX models
After converting an autoregressive model to ONNX, it would be nice to be able to generate text with it via something like:
from transformers import OnnxTextGenerationModel, AutoTokenizer

model_path = "gpt-something.onnx"
tokenizer_name = "gpt2"

tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)
model = OnnxTextGenerationModel(model_path)

# and then
text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors="np")  # ONNX Runtime consumes NumPy arrays
output = model.generate(**encoded_input)
This should come with support for past_key_values handled internally in the most efficient way, i.e. reusing the attention key/value cache so that each decoding step only processes the newly generated token.
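For context, this is roughly the past_key_values plumbing one has to write by hand today when driving such an export directly with onnxruntime. It is a minimal sketch under assumptions, not an existing API: the tensor names (input_ids, attention_mask, past_key_values.*.key/value on the input side; logits, present.*.key/value on the output side) follow the common GPT-2-style export convention and can vary between exports.

import onnxruntime as ort

session = ort.InferenceSession("gpt-something.onnx")
# Assumes the first output is "logits" and the remaining outputs are the
# "present.*" key/value tensors of each attention layer.
output_names = [output.name for output in session.get_outputs()]

def decode_step(input_ids, attention_mask, past):
    # Feed the cached keys/values from the previous step alongside the inputs
    # (all values are NumPy arrays).
    feeds = {"input_ids": input_ids, "attention_mask": attention_mask, **past}
    outputs = session.run(output_names, feeds)
    logits = outputs[0]
    # Rename each "present.*" output to its matching "past_key_values.*" input
    # so the cache can be fed straight back in on the next step.
    past = {
        name.replace("present", "past_key_values"): value
        for name, value in zip(output_names[1:], outputs[1:])
    }
    return logits, past

On the first call the past entries must still be supplied (typically as zero-length tensors whose exact shapes depend on the export); after that, each step feeds only the single newest token plus the cache, which is what makes incremental decoding cheap and what a built-in OnnxTextGenerationModel could handle transparently.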
Motivation
When trying to accelerate inference with transformers, it is frustrating to be unable to load an ONNX model with the library and call a model.generate method to seamlessly generate sequences and perform beam search. This forces us to rely on custom implementations, which take time to write and are far more prone to bugs.
We can try to hack together a subclass of GenerationMixin, but having to convert tensors to and from PyTorch on every decoding step makes everything too slow.
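To make that workaround concrete, here is a rough sketch of such a hack: wrapping an onnxruntime session so that GenerationMixin's decoding loop can drive it. The class name is made up, the exact set of attributes and methods generate() requires varies across transformers versions (as does the import path for GenerationMixin), and the "logits" output name is an assumption; the point is the torch-to-numpy-and-back round-trip paid on every decoding step.

import torch
import onnxruntime as ort
from transformers import AutoConfig, GenerationMixin
from transformers.modeling_outputs import CausalLMOutput

class OnnxCausalLMHack(GenerationMixin):  # hypothetical name, not a real class
    main_input_name = "input_ids"

    def __init__(self, onnx_path, config):
        self.config = config
        self.device = torch.device("cpu")
        self.session = ort.InferenceSession(onnx_path)

    def prepare_inputs_for_generation(self, input_ids, **kwargs):
        return {"input_ids": input_ids,
                "attention_mask": kwargs.get("attention_mask")}

    def __call__(self, input_ids, attention_mask=None, **kwargs):
        # Every decoding step pays for a torch -> numpy copy here ...
        feeds = {
            "input_ids": input_ids.cpu().numpy(),
            "attention_mask": attention_mask.cpu().numpy(),
        }
        logits = self.session.run(["logits"], feeds)[0]
        # ... and a numpy -> torch copy here, only so that the PyTorch-centric
        # sampling/beam-search code in GenerationMixin can consume the result.
        return CausalLMOutput(logits=torch.from_numpy(logits))

model = OnnxCausalLMHack("gpt-something.onnx", AutoConfig.from_pretrained("gpt2"))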
Your contribution
I can try submitting a PR, but it would take a while, as I work full-time and may not have enough time to move quickly.
Comments: 8 (8 by maintainers)
Yes, this is planned. Nice to know that there is interest in such features!
Pinging @lewisbails and @philschmid, as they were the ones who suggested adding this kind of feature to optimum. We are following this discussion in https://github.com/huggingface/optimum/issues/55.