Hi, really like this work!

Given its advantage in inference speed, have you considered adding support functions, like the example below, to compile SetFitTrainer into the ONNX format for production use?

If that sounds promising, I would be happy to work on this feature!

Example:

# Train 
trainer.train()

# Compile to onnx
onnx_path = "path/to/store/compiled/model.onnx"
trainer.to_onnx(onnx_path, **onnx_related_kwargs)

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 15 (12 by maintainers)

Top GitHub Comments

7 reactions
nbertagnolli commented, Oct 17, 2022

I’m not sure if this is helpful, but I was working on deploying some of these models using ONNX, and this is what I have come up with so far. If others are looking for a place to start, here is some code that converts the base model and the head so that you can run them separately. I haven’t been able to merge them into one graph yet, but hopefully it’s a start while we wait for #8 😃.

from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
from onnxruntime import InferenceSession
from pathlib import Path
from setfit import SetFitModel
from transformers import AutoTokenizer
from transformers.convert_graph_to_onnx import convert
import torch
import numpy as np
import sys


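def mean_pooling_np(model_output: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    # model_output[0] holds the per-token embeddings (last hidden state)
    token_embeddings = model_output[0]
    input_mask_expanded = np.broadcast_to(
        np.expand_dims(attention_mask, axis=2), token_embeddings.shape
    )
    sum_embeddings = np.sum(input_mask_expanded * token_embeddings, axis=1)
    # Clip from below only, to avoid dividing by zero for fully masked rows
    sum_mask = np.clip(input_mask_expanded.sum(1), 1e-9, None)
    # The integer mask promotes the result to float64; cast back to float32
    # to match the FloatTensorType input declared for the ONNX head
    return (sum_embeddings / sum_mask).astype(np.float32)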


trained_model_path = "path/to/your/trained/setfit/model"
# Use two distinct file paths here; otherwise the head export
# overwrites the sentence model
onnx_sentence_model_path = "/path/to/save/onnx/to"
onnx_head_model_path = "/path/to/save/onnx/to"

model = SetFitModel.from_pretrained(trained_model_path)

# Convert the sentence transformer model to onnx
convert(
    "pt",
    trained_model_path,
    Path(onnx_sentence_model_path).absolute(),
    15,
    trained_model_path,
)

# Convert sklearn head into ONNX format
# 768 is the embedding size of paraphrase-mpnet-base-v2; adjust it for other base models
initial_type = [("model_head", FloatTensorType([None, 768]))]
onx = convert_sklearn(model.model_head, initial_types=initial_type, target_opset=15)
with open(onnx_head_model_path, "wb") as f:
    f.write(onx.SerializeToString())


# Load and use the models
text = ["some text to do stuff with"]
tokenizer = AutoTokenizer.from_pretrained(
    "sentence-transformers/paraphrase-mpnet-base-v2"
)
session = InferenceSession(onnx_sentence_model_path)
head_session = InferenceSession(onnx_head_model_path)
tokens = tokenizer(text, truncation=True, return_tensors="np")
preds = session.run(None, dict(tokens))
pooled_preds = mean_pooling_np(preds, tokens["attention_mask"])
print(head_session.run(None, {"model_head": pooled_preds}))
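The pooling step in the script above can be checked on its own, without a model or an ONNX session. The following is a minimal sketch on toy data, assuming NumPy only. Unlike the function above, this variant takes the token embeddings directly rather than the full session output, and the array values are made up purely for illustration. It shows that masked tokens are excluded from the average and that the result stays in float32, which is the dtype an ONNX head declared with FloatTensorType expects.

```python
import numpy as np


def mean_pooling_np(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings over the sequence axis, ignoring masked tokens."""
    # Shape (batch, seq_len, 1); cast so the product keeps the embedding dtype
    mask = np.expand_dims(attention_mask, axis=2).astype(token_embeddings.dtype)
    summed = np.sum(token_embeddings * mask, axis=1)
    # Clip from below to avoid dividing by zero when a row is fully masked
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts


# Toy batch: 2 sequences, 3 tokens each, hidden size 2 (illustrative values)
embeddings = np.array(
    [
        [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],  # third token is padding
        [[5.0, 6.0], [7.0, 8.0], [9.0, 10.0]],
    ],
    dtype=np.float32,
)
attention_mask = np.array([[1, 1, 0], [1, 1, 1]], dtype=np.int64)

pooled = mean_pooling_np(embeddings, attention_mask)
print(pooled)        # rows: [2. 3.] and [7. 8.]
print(pooled.dtype)  # float32
```

The padded token with value 100 has no effect on the first row, because its mask entry is 0.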
2 reactions
nbertagnolli commented, Nov 9, 2022

@AnshulP10 please take a look at #156, the PR we’ve been working on. @kgourgou pointed out that the above script needs some modifications for some models; this PR hopefully addresses those concerns. The PR includes a function called export_onnx, which should do what you want. Let me know if you still have trouble.
