Community contribution - `BetterTransformer` integration for more models!
The BetterTransformer API provides faster inference on CPU & GPU through a simple interface!
Models can benefit from significant speedups with a one-liner, provided the latest version of PyTorch is installed. A complete guideline on how to convert a new model is available in the BetterTransformer documentation!
Here is a list of models that could potentially be supported. Pick one of the architectures below and let's discuss the conversion!
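In `optimum`, the documented entry point for the conversion is `BetterTransformer.transform(model)`. The sketch below mimics the underlying idea in plain Python, with no dependencies: walk the model's module tree and swap each supported encoder layer for a fastpath equivalent. The class names and the `transform` helper here are illustrative stand-ins, not the actual `optimum` implementation.

```python
# Conceptual sketch of a BetterTransformer-style conversion: recursively
# replace registered layer types with their "fast" counterparts.
# All class names below are hypothetical stand-ins.

class Module:
    """Minimal stand-in for a framework module with named children."""
    def __init__(self, **children):
        self.children = children

class MobileBertLayer(Module): pass          # "slow" layer, as in transformers
class MobileBertLayerFast(Module): pass      # hypothetical fastpath replacement

# Registry mapping slow layer types to fast replacements; adding support for
# a new architecture amounts to adding an entry like this.
CONVERSION = {MobileBertLayer: MobileBertLayerFast}

def transform(module):
    """Recursively swap registered layers, leaving everything else as-is."""
    for name, child in module.children.items():
        fast_cls = CONVERSION.get(type(child))
        if fast_cls is not None:
            module.children[name] = fast_cls(**child.children)
        else:
            transform(child)
    return module

model = Module(encoder=Module(layer0=MobileBertLayer(),
                              layer1=MobileBertLayer()))
model = transform(model)
print(type(model.children["encoder"].children["layer0"]).__name__)
# prints "MobileBertLayerFast"
```

Contributing a new architecture is then mostly a matter of writing the fast layer class and registering it, as the guideline describes.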
Text models 🖊️ :
- FSMT - FSMTEncoderLayer / @Sumanth077 https://github.com/huggingface/optimum/pull/494
- MobileBERT - MobileBertLayer / @raghavanone https://github.com/huggingface/optimum/pull/506
- MBart - MBartEncoderLayer + M2M100EncoderLayer / https://github.com/huggingface/optimum/pull/516 @ravenouse
- ProphetNet - ProphetNetEncoderLayer
- RemBert - RemBertLayer / @hchings https://github.com/huggingface/optimum/pull/545
- RocBert - RocBertLayer / @shogohida https://github.com/huggingface/optimum/pull/542
- RoFormer - RoFormerLayer
- Tapas - TapasLayer / https://github.com/huggingface/optimum/pull/520
Vision models 📷 :
- Detr - DetrLayer
- Flava - FlavaLayer / https://github.com/huggingface/optimum/pull/538
- GLPN - GLPNLayer (cannot be supported)
- ViLT - ViLTLayer / https://github.com/huggingface/optimum/pull/508
Audio models 🔉 :
- Speech2Text - Speech2TextLayer
- NEW: Audio Spectrogram Transformer - ASTLayer / @ravenouse https://github.com/huggingface/optimum/pull/548
Let us also know if you think we missed an architecture that could be supported. Note that for the encoder-decoder based models below, we expect to convert the encoder only.
Support for decoder-based models coming soon!
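For encoder-decoder models, the encoder-only scope falls out of the registry naturally: decoder layer types are simply not registered, so they pass through the conversion untouched. A minimal sketch (layer names are stand-ins, not `optimum` code):

```python
# Only encoder layer types are registered for conversion; decoder layers
# are left unchanged. Names are hypothetical stand-ins.
FAST_LAYERS = {"MBartEncoderLayer": "MBartEncoderLayerBetterTransformer"}

def convert_layer_names(names):
    """Map each registered layer name to its fast variant; keep the rest."""
    return [FAST_LAYERS.get(n, n) for n in names]

print(convert_layer_names(["MBartEncoderLayer", "MBartDecoderLayer"]))
# prints ['MBartEncoderLayerBetterTransformer', 'MBartDecoderLayer']
```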
Issue Analytics
- State:
- Created 10 months ago
- Reactions: 7
- Comments: 51 (35 by maintainers)
Top GitHub Comments
Hi @ravenouse! From what I understand, this function is a C++ binding of the transformer encoder operation, which is first declared here and fully defined here. As you can see, the whole set of transformer encoder operations (self-attention + FFN) is fused into a single operation.
It is not in the list because DebertaV2 does not have a regular attention mechanism, so it is not possible to use it with BetterTransformer.