Add marian model type support for ORTOptimizer
Feature request
The recent Optimum library makes it possible to export the marian model type to the ONNX format. However, optimization via ORTOptimizer does not seem to support this model type.
Please find below the code used:
from optimum.onnxruntime import ORTModelForSeq2SeqLM, ORTOptimizer
from optimum.onnxruntime.configuration import OptimizationConfig

model_id = "Helsinki-NLP/opus-mt-fr-en"
onnx_path = "onnx_opus_mt_fr_en"  # output directory for the optimized model (any local path)

# Load the vanilla transformers model and convert it to ONNX
model = ORTModelForSeq2SeqLM.from_pretrained(model_id, from_transformers=True)

# Create the ORTOptimizer
optimizer = ORTOptimizer.from_pretrained(model)

# Define the optimization strategy by creating the appropriate configuration
optimization_config = OptimizationConfig(
    optimization_level=1,
    optimize_for_gpu=True,
    fp16=True,
)

# Optimize the model
optimizer.optimize(save_dir=onnx_path, optimization_config=optimization_config)
And the error traceback returned:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-8-0c22ffd4b88a> in <module>
12
13 # Optimize the model
---> 14 optimizer.optimize(save_dir=onnx_path, optimization_config=optimization_config)
~/anaconda3/envs/sts-transformers-gpu-fresh/lib/python3.8/site-packages/optimum/onnxruntime/optimization.py in optimize(self, optimization_config, save_dir, file_suffix, use_external_data_format)
113 save_dir.mkdir(parents=True, exist_ok=True)
114 model_type = self.config.model_type
--> 115 ORTConfigManager.check_supported_model_or_raise(model_type)
116
117 # Save the model configuration
~/anaconda3/envs/sts-transformers-gpu-fresh/lib/python3.8/site-packages/optimum/onnxruntime/utils.py in check_supported_model_or_raise(cls, model_type)
110 def check_supported_model_or_raise(cls, model_type: str) -> bool:
111 if model_type not in cls._conf:
--> 112 raise KeyError(
113 f"{model_type} model type is not supported yet. Only {list(cls._conf.keys())} are supported. "
114 f"If you want to support {model_type} please propose a PR or open up an issue."
KeyError: "marian model type is not supported yet. Only ['bert', 'albert', 'big_bird', 'camembert', 'codegen', 'distilbert', 'deberta', 'deberta-v2', 'electra', 'roberta', 'bart', 'gpt2', 'gpt_neo', 'xlm-roberta'] are supported. If you want to support marian please propose a PR or open up an issue."
Motivation
It would be greatly beneficial to be able to optimize the ONNX models, given the low inference speed of this kind of model.
Your contribution
I could be a test user if the library is updated with this new feature.
Top GitHub Comments
Hi @Matthieu-Tinycoaching,
Being able to export a model to the ONNX format means that it already has its ONNX config implemented. As you can see here, the ONNX config for Marian already exists.
If that is not the case, you can open a pull request in Transformers to add it (basically, you need to specify the inputs, the outputs, and the opset, and create a method to generate dummy inputs). You can find more details here, and more discussion here.
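To give an idea of what such a config declares, here is a minimal illustrative sketch (the class name is made up and the input set is an assumption; Marian already ships its own config, so this only shows the shape of the API):

from collections import OrderedDict
from typing import Mapping
from transformers.onnx import OnnxConfig

# Hypothetical class name, shown only to illustrate what an ONNX config declares
class MyModelOnnxConfig(OnnxConfig):
    @property
    def inputs(self) -> Mapping[str, Mapping[int, str]]:
        # Map each model input to its dynamic axes (batch size and sequence length)
        return OrderedDict(
            [
                ("input_ids", {0: "batch", 1: "sequence"}),
                ("attention_mask", {0: "batch", 1: "sequence"}),
            ]
        )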
Sure, you can build optimum from source with the following:
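A typical way to install Optimum from source (given here as an assumption, not necessarily the exact command from the original comment) is:

pip install git+https://github.com/huggingface/optimum.git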