Triton Server Docker Image with ONNXRuntime support
Is your feature request related to a problem? Please describe.
Currently, we are using the full tritonserver Docker image (i.e. `xx.yy-py3`) to serve our ONNXRuntime models. However, the image size is too large for us, and we would like to reduce it by having a tritonserver Docker image that supports only ONNXRuntime models, similar to the existing `xx.yy-tf2-python-py3` image for TensorFlow 2.x and `xx.yy-pyt-python-py3` image for PyTorch.
One constraint we have is that we do not want to manually manage tritonserver image updates by cloning the git repository and running the `compose.py` script, now or in the near future. That is why we would prefer an official image from NVIDIA's container registry.
Describe the solution you’d like
- An officially supported tritonserver Docker image with only the ONNXRuntime and Python backends (a hypothetical pull command is sketched below).
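Following the naming convention of the existing framework-specific images, the requested image might be pulled like this. The tag is purely hypothetical; no such image exists at the time of writing:

```
# Hypothetical tag, by analogy with xx.yy-tf2-python-py3 and xx.yy-pyt-python-py3
docker pull nvcr.io/nvidia/tritonserver:xx.yy-onnx-python-py3
```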
Describe alternatives you’ve considered
- A Python `tritonserver` wheel with ONNXRuntime support that can be installed via `pip`, so that we can use the `xx.yy-py3-min` image in a Dockerfile to build our own custom image (a hypothetical sketch follows this list).
- A `tritonserver` shell script that would let us build `tritonserver` with ONNXRuntime support without cloning the git repository, so that we can use the `xx.yy-py3-min` image in a Dockerfile to build our own custom image.
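To illustrate the first alternative, here is a minimal Dockerfile sketch. The `tritonserver` wheel and its `[onnxruntime]` extra are assumptions — neither exists today; this is the workflow the alternative is asking for:

```dockerfile
# Start from the official minimal base image
FROM nvcr.io/nvidia/tritonserver:xx.yy-py3-min

# Hypothetical: a pip-installable tritonserver wheel with an ONNXRuntime extra
RUN pip install "tritonserver[onnxruntime]"

# Copy in a model repository and serve it
COPY model_repository /models
CMD ["tritonserver", "--model-repository=/models"]
```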
Additional context
None
Top GitHub Comments
+1 to this request. I understand that there are a lot of combinations to look after, but I think ORT is far more important than, say, TF2. Triton is not a tool for beginners, and who uses TF2 these days?
Maintaining and hosting a separate Triton build ourselves is some work, whereas on your side it's just another configuration.
@onurcayci
What is the problem with running `compose.py`? Is it missing functionality? It seems like you only need to run it with the backends you need (a sketch follows below). Is the image taking up too much space?
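For reference, a minimal sketch of the kind of `compose.py` invocation implied above, assuming the `--backend` and `--container-version` flags described in the Triton server repository's compose documentation (the exact command was not preserved in the original comment):

```
# Clone the release branch matching the container version you want
git clone -b rxx.yy https://github.com/triton-inference-server/server.git
cd server

# Build a composed image containing only the ONNXRuntime and Python backends
python3 compose.py --backend onnxruntime --backend python --container-version xx.yy
```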
@yaysummeriscoming As you noted, we could see some explosion in maintaining different combinations, not to mention all the testing we would have to do for each specific image we maintain. Setting up the framework for such support will take some time, or a lot of customer interest to justify building one specific request. In the meantime, if there is commonality in how the python+onnxruntime backends are used, you, @onurcayci, and other members of the community can share your builds on https://www.docker.com/products/docker-hub/.
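For example, assuming a composed image built as sketched earlier (compose.py tags its output `tritonserver` by default, as far as I know), sharing it on Docker Hub could look like this, with the namespace as a placeholder:

```
docker tag tritonserver your-dockerhub-user/tritonserver-onnx-python:xx.yy
docker push your-dockerhub-user/tritonserver-onnx-python:xx.yy
```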