ONNX-converted model has a different output shape than the original (fine-tuned) model
🐛 Bug
Information
Model I am using (Bert, XLNet …): mrm8488/distilroberta-base-finetuned-sentiment from the hub
Language I am using the model on (English, Chinese …): English
The problem arises when using:
- the official example scripts: (give details below)
- my own modified scripts: (give details below)
I use the 04-onnx-export.ipynb notebook and have only changed the model name and the tokenizer.
The issue appeared with every fine-tuned model I tried, whether for classification or multiple-choice questions.
The task I am working on is:
- my own task or dataset: (give details below)
- an official GLUE/SQuAD task: classification
To reproduce
Steps to reproduce the behavior:
Import AutoTokenizer and AutoModelForSequenceClassification, and change the tokenizer and model names. Here is the section we are interested in:
```python
# ...
!rm -rf onnx/

from transformers.convert_graph_to_onnx import convert

# Handles all the above steps for you
convert(framework="pt", model="mrm8488/distilroberta-base-finetuned-sentiment", output="onnx/bert-base-cased.onnx", opset=11)

# ...
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("mrm8488/distilroberta-base-finetuned-sentiment")

# create_model_for_provider is defined in an earlier cell of the notebook
cpu_model = create_model_for_provider("onnx/bert-base-cased.onnx", "CPUExecutionProvider")

# Inputs are provided through numpy arrays
model_inputs = tokenizer.encode_plus("My name is Bert", return_tensors="pt")
inputs_onnx = {k: v.cpu().detach().numpy() for k, v in model_inputs.items()}

# Run the model (None = get all the outputs)
sequence, pooled = cpu_model.run(None, inputs_onnx)

# Print information about the outputs
print(f"Sequence output: {sequence.shape}, Pooled output: {pooled.shape}")

pytorch_model = AutoModelForSequenceClassification.from_pretrained("mrm8488/distilroberta-base-finetuned-sentiment")
a, = pytorch_model(**model_inputs)
print(f"finetune non onnx pytorch model output: {a.shape}")
# ...
```
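For completeness, `create_model_for_provider` comes from an earlier cell of the notebook (elided above). A paraphrased sketch of that helper, assuming onnxruntime is installed — check the notebook for the exact version:

```python
from onnxruntime import GraphOptimizationLevel, InferenceSession, SessionOptions, get_all_providers

def create_model_for_provider(model_path: str, provider: str) -> InferenceSession:
    assert provider in get_all_providers(), f"provider {provider} not found"
    # A few session options that can impact performance
    options = SessionOptions()
    options.graph_optimization_level = GraphOptimizationLevel.ORT_ENABLE_ALL
    # Load the graph and prepare the backend for the requested provider
    session = InferenceSession(model_path, options, providers=[provider])
    session.disable_fallback()
    return session
```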
Expected behavior
I was expecting the ONNX output shape to be the same as the non-converted model's output shape, but that's not the case:
```
Sequence output: (1, 6, 768), Pooled output: (1, 768)
finetune non onnx pytorch model output: torch.Size([1, 6])
```
It looks like the model's last layer, the one tied to the classification task, is not included in the ONNX export.
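One way to check this — a minimal sketch, reusing `sequence`, `a`, and `pytorch_model` from above, and assuming the model's `classifier` head accepts the sequence output directly (as `RobertaForSequenceClassification`'s does):

```python
import torch

# Feed the ONNX "sequence" output through the PyTorch classification head.
# If the resulting logits match the PyTorch model's, the exported graph
# only contains the base (feature-extraction) model.
with torch.no_grad():
    logits = pytorch_model.classifier(torch.from_numpy(sequence))

print(logits.shape)                           # torch.Size([1, 6]), same as the PyTorch output
print(torch.allclose(logits, a, atol=1e-4))   # True if only the head is missing from the graph
```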
Does it make sense? @mfuntowicz
Environment info
Google Colab with a GPU
Top GitHub Comments
This seems to be related to this issue.
As @hrsmanian points out, it seems that in convert_graph_to_onnx.py the model is currently converted to a ‘feature-extraction’ pipeline by default, which discards the classification layer. Changing the pipeline type (line 108 of the .py file) to ‘ner’ seems to have worked in @hrsmanian’s case.
In the case of binary classification, I tried changing the pipeline type to ‘sentiment-analysis’ (my model is a binary BertForSequenceClassification) but got a ValueError (ValueError: not enough values to unpack (expected 2, got 1)) when trying to run the session. I used simpletransformers (which is based on this repo) to do binary classification with BERT, then followed the conversion and inference instructions from the blog post.
Let me know if you see what the problem is @mfuntowicz 😃
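The ValueError above is likely just the unpacking in the inference snippet: a graph exported from a ‘sentiment-analysis’ pipeline has a single output (the logits), whereas the ‘feature-extraction’ graph yields the (sequence, pooled) pair. A hedged sketch of the adjustment, reusing `cpu_model` and `inputs_onnx` from the reproduction code:

```python
# The classification graph has a single output, so don't unpack two values:
outputs = cpu_model.run(None, inputs_onnx)
logits = outputs[0]  # expected shape: (batch_size, num_labels)
print(f"Logits: {logits.shape}")
```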
@manueltonneau You're right, we're currently enforcing `feature-extraction` because not all our pipelines are compatible with the ONNX graph representation. I'll have a look ASAP to identify which pipelines are compatible and which are not, so that we can add the possibility of exporting other kinds of pipelines through the script.
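For reference, later transformers releases did add a pipeline_name argument to convert(). If your installed version exposes it, something like the following should export the classification head as well (the output filename is illustrative):

```python
from pathlib import Path
from transformers.convert_graph_to_onnx import convert

# pipeline_name selects which pipeline (and therefore which model head)
# gets exported; "sentiment-analysis" keeps the classification layer.
convert(
    framework="pt",
    model="mrm8488/distilroberta-base-finetuned-sentiment",
    output=Path("onnx/distilroberta-sentiment.onnx"),
    opset=11,
    pipeline_name="sentiment-analysis",
)
```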