ONNX support?

Hi, really like this work!

Given ONNX's advantage in inference speed, have you considered adding a helper function, like the one in the example below, to export a model trained with SetFitTrainer to the ONNX format for production use?

If that sounds promising, I will be happy to make this feature work!

Example:

# Train
trainer.train()

# Export to ONNX (proposed API, not implemented yet)
onnx_path = "path/to/store/compiled/model.onnx"
trainer.to_onnx(onnx_path, **onnx_related_kwargs)

Top GitHub Comments

7 reactions

nbertagnolli commented, Oct 17, 2022

I’m not sure if this is helpful, but I have been working on deploying some of these models with ONNX, and this is what I have so far. If others are looking for a place to start, here is some code that converts the base model and the head separately, so you can run them one after the other. I haven’t been able to merge them into one graph yet, but hopefully it’s a start while we wait for #8 😃.

from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
from onnxruntime import InferenceSession
from pathlib import Path
from setfit import SetFitModel
from transformers import AutoTokenizer
from transformers.convert_graph_to_onnx import convert
import torch
import numpy as np
import sys


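def mean_pooling_np(model_output: list, attention_mask: np.ndarray) -> np.ndarray:
    # model_output is the list returned by InferenceSession.run; its first
    # element is assumed to hold the per-token embeddings.
    token_embeddings = model_output[0]
    input_mask_expanded = np.broadcast_to(
        np.expand_dims(attention_mask, axis=2), token_embeddings.shape
    )
    sum_embeddings = np.sum(input_mask_expanded * token_embeddings, axis=1)
    # sys.maxint does not exist in Python 3; only a lower bound is needed here.
    sum_mask = np.clip(input_mask_expanded.sum(1), 1e-9, None)
    return sum_embeddings / sum_mask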


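trained_model_path = "path/to/your/trained/setfit/model"
# Placeholder file names: the two ONNX files must not share a path,
# otherwise the head would overwrite the sentence model.
onnx_sentence_model_path = "/path/to/save/onnx/to/sentence_model.onnx"
onnx_head_model_path = "/path/to/save/onnx/to/model_head.onnx"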

model = SetFitModel.from_pretrained(trained_model_path)

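# Convert the sentence transformer model to ONNX
convert(
    "pt",  # framework: PyTorch
    trained_model_path,  # model to export
    Path(onnx_sentence_model_path).absolute(),  # output file
    15,  # ONNX opset version
    trained_model_path,  # tokenizer, loaded from the same directory
)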

# Convert sklearn head into ONNX format
# 768 is the embedding size of the model body used here; change it to match your model.
initial_type = [("model_head", FloatTensorType([None, 768]))]
onx = convert_sklearn(model.model_head, initial_types=initial_type, target_opset=15)
with open(onnx_head_model_path, "wb") as f:
    f.write(onx.SerializeToString())


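# Load and use the models
text = ["some text to do stuff with"]
# Use the tokenizer that matches the model body you trained with.
tokenizer = AutoTokenizer.from_pretrained(
    "sentence-transformers/paraphrase-mpnet-base-v2"
)
session = InferenceSession(onnx_sentence_model_path)
head_session = InferenceSession(onnx_head_model_path)
# padding=True lets texts of different lengths be batched into one array.
tokens = tokenizer(text, padding=True, truncation=True, return_tensors="np")
preds = session.run(None, dict(tokens))
pooled_preds = mean_pooling_np(preds, tokens["attention_mask"])
# The head was declared with FloatTensorType, so it expects float32 input.
print(head_session.run(None, {"model_head": pooled_preds.astype(np.float32)}))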
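
For anyone who wants to sanity-check the pooling step without exporting a model, here is a minimal NumPy-only sketch. It is not part of the script above: the toy embeddings and mask are made-up numbers, and this variant of mean_pooling_np takes the token embeddings array directly instead of the list returned by the ONNX session. It only illustrates that padded positions do not affect the averaged sentence embedding.

```python
import numpy as np


def mean_pooling_np(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings over the sequence axis, ignoring padded positions."""
    # (batch, seq_len) -> (batch, seq_len, 1) so the mask broadcasts over the hidden axis.
    mask = attention_mask[:, :, None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    # Clip the token count from below so an all-padding row cannot divide by zero.
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts


# Toy batch (made-up values): 2 sentences, 3 token positions, hidden size 2.
token_embeddings = np.array(
    [
        [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],  # last position is padding
        [[2.0, 2.0], [4.0, 4.0], [6.0, 6.0]],  # no padding
    ]
)
attention_mask = np.array([[1, 1, 0], [1, 1, 1]])

pooled = mean_pooling_np(token_embeddings, attention_mask)
print(pooled)  # first row averages only the two real tokens: [2. 3.]
```

The large values in the padded position are there on purpose: if the mask were ignored, they would visibly skew the first row.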

2 reactions

nbertagnolli commented, Nov 9, 2022

@AnshulP10 please take a look at #156, the PR we’ve been working on. @kgourgou pointed out that the script above needs some modifications for certain models, and this PR hopefully addresses those concerns. The PR includes a function called export_onnx, which should do what you want. Let me know if you still have trouble.
