Option for adding or overriding model config attributes at server startup
Is your feature request related to a problem? Please describe.
When using Triton server in a deployment with variable hardware configurations (CPU-only for some environments, using GPUs for others) and models stored in S3, we might have to create multiple copies of the model repository just to have multiple versions of the configuration file for each, where we can configure the hardware we want to use for each deployment.
Describe the solution you’d like
It would be very helpful to have a config override flag like this:
tritonserver --model-store=s3://my-bucket/my-model-repository --config-override=/mnt/config.pbtxt
This way, I can put most attributes, such as the model platform and the inputs/outputs, in the config.pbtxt stored in S3, then put attributes such as the instance groups and batch size configuration in the config.pbtxt provided to the server at deployment time.
Describe alternatives you’ve considered
Alternative solutions I’ve considered include:
- Adding a custom rclone command to load the models from S3, then overwriting the config attributes before running the server and pointing it to the model repository that’s now available locally
- Having multiple copies of the model (and the configuration alongside it), but this is far from an ideal solution to be honest
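To make the first workaround concrete, here is a minimal sketch of the "overwrite the config attributes" step, meant to run after the repository has been copied locally. It is not a Triton feature: the function names `split_top_level` and `merge_pbtxt` are hypothetical, and the merge is a naive text-level one that assumes each top-level field starts on its own line, that any opening bracket or brace sits on the same line as its field name, and that string values contain no brackets or braces. A more robust version would parse the files with the protobuf text format instead.

```python
import re


def split_top_level(text):
    """Split config.pbtxt text into (field_name, entry_text) pairs.

    Naive text-level split (an assumption of this sketch): bracket depth
    is tracked per line, so brackets inside string values would break it.
    """
    entries = []
    current = []
    depth = 0
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments that sit between top-level entries.
        if depth == 0 and not current and (not stripped or stripped.startswith("#")):
            continue
        current.append(line)
        depth += sum(line.count(c) for c in "[{")
        depth -= sum(line.count(c) for c in "]}")
        if depth == 0:
            entry = "\n".join(current)
            match = re.match(r"\s*([A-Za-z_]\w*)", entry)
            if match:
                entries.append((match.group(1), entry))
            current = []
    return entries


def merge_pbtxt(base, override):
    """Return base config text with top-level fields replaced by override.

    Any top-level field named in the override replaces every entry with
    the same name in the base; all other base entries are kept as-is.
    """
    override_entries = split_top_level(override)
    overridden = {name for name, _ in override_entries}
    kept = [text for name, text in split_top_level(base) if name not in overridden]
    return "\n".join(kept + [text for _, text in override_entries]) + "\n"
```

With something like this, the config.pbtxt stored in S3 could carry the platform and inputs/outputs, while a small deployment-time file carrying only `instance_group` and `max_batch_size` is merged over it before `tritonserver` is pointed at the local copy.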
Additional context
I’m targeting a Kubernetes deployment where the Triton server is deployed using a custom Helm chart. It’s common practice to provide hardware requirements when deploying the chart, and with it, I’m hoping to also pass attributes such as instance_group and have them reflected in my deployment 😃
Thank you!
Top GitHub Comments
Thank you for your detailed ticket, Ashraf. I filed an enhancement request for this.
No updates. I’ve pinged those leading prioritization and mentioned your urgency. Keep in mind that there isn’t time to get this in for 22.07, so it would be included in 22.08 at the earliest. (Of course, the repo is public, so you’d be welcome to build with the PR yourself, if/once this feature is merged.)