
Triton Server Docker Image with ONNXRuntime support


Is your feature request related to a problem? Please describe.

Currently, we are using the full tritonserver docker image (i.e. xx.yy-py3) in order to use our ONNXRuntime models. However, the image size is too large for us and we would like to decrease it by having a tritonserver docker image that only supports ONNXRuntime models, similar to how there is xx.yy-tf2-python-py3 for TensorFlow 2.x and xx.yy-pyt-python-py3 for PyTorch.

One constraint we have is that, at least in the near future, we do not want to manage tritonserver image updates manually by cloning the git repository and using the compose.py file. That is why we would prefer an official image from NVIDIA's container registry (NGC).

Describe the solution you’d like

  • An officially supported tritonserver docker image with ONNXRuntime and Python backends only.

Describe alternatives you’ve considered

  • A python tritonserver wheel with ONNXRuntime support that can be installed via pip so that we can use the xx.yy-py3-min image in a Dockerfile to build our own custom image.
  • A tritonserver shell script that will allow us to build tritonserver with ONNXRuntime support without the need to clone the git repository, so that we can use the xx.yy-py3-min image in a Dockerfile to build our own custom image (a rough illustration of such a custom Dockerfile follows this list).
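
For illustration only, here is a rough sketch of the kind of custom Dockerfile both alternatives have in mind: start from the xx.yy-py3-min base and add only the ONNXRuntime and Python backends, in this case by copying them out of the full image. The tags and /opt/tritonserver paths are assumptions based on the standard container layout, and the Dockerfile that compose.py generates additionally sets environment variables and entrypoint scripts, so treat this purely as a starting point rather than an official recipe.

# Hypothetical sketch only: copy the server binaries and selected backends from the
# full image into the -min base. Tags and paths are assumptions, not an official recipe.
cat > Dockerfile.onnx <<'EOF'
FROM nvcr.io/nvidia/tritonserver:22.05-py3 AS full
FROM nvcr.io/nvidia/tritonserver:22.05-py3-min
COPY --from=full /opt/tritonserver/bin /opt/tritonserver/bin
COPY --from=full /opt/tritonserver/lib /opt/tritonserver/lib
COPY --from=full /opt/tritonserver/backends/onnxruntime /opt/tritonserver/backends/onnxruntime
COPY --from=full /opt/tritonserver/backends/python /opt/tritonserver/backends/python
EOF
docker build -f Dockerfile.onnx -t tritonserver-onnx-slim .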

Additional context

None

Issue Analytics

  • State: closed
  • Created: a year ago
  • Reactions: 1
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

3 reactions
yaysummeriscoming commented, Jun 29, 2022

+1 to this request. Understand that there are a lot of combinations to look after, but I think ORT is far more important than, say, TF2. Triton is not a tool for beginners, and who uses TF2 these days?

Maintaining & hosting a separate Triton build is some work, whereas on your side it’s just another configuration.

1 reaction
jbkyang-nvi commented, Jun 29, 2022

@onurcayci

A tritonserver shell script that will allow us to build tritonserver with ONNXRuntime support without the need to clone the git repository so that we can use the xx.yy-py3-min image in a Dockerfile to build our own custom image.

What is the problem with running compose.py? Is it missing functionality? It seems like you only need to run:

git clone --single-branch --depth=1 -b <version number such as r22.05> https://github.com/triton-inference-server/server.git
cd server
python3 compose.py --backend onnxruntime --backend python

Is the image taking too much space?

@yaysummeriscoming As you noted, we could see some explosion in the combinations we maintain, not to mention all the testing we have to do for each specific image. Setting up the framework for such support will take some time, or a lot of customer interest to justify building one specific configuration. In the meantime, if there is commonality in usage of the python+onnxruntime backends, you, @onurcayci, and other members of the community can share your builds on https://www.docker.com/products/docker-hub/.
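
For reference, once the composed image has been built with the commands above, checking its size and sharing it might look roughly like this; the local image name tritonserver is assumed to be the compose.py default output name, and the Docker Hub repository is a placeholder:

# Compare the composed image size against the full xx.yy-py3 image
docker images tritonserver
# Share the build, e.g. on Docker Hub (repository name is a placeholder)
docker tag tritonserver <your-dockerhub-user>/tritonserver-onnx:22.05
docker push <your-dockerhub-user>/tritonserver-onnx:22.05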

