TorchServe ignores batch config properties
In my config.properties file, I have the lines:
batch_size=4
max_batch_delay=200
I started TorchServe with the command line:
torchserve --start --ts-config config.properties --models d161good=d161good.mar --model-store model_store
When I query the status of the endpoint with curl http://127.0.0.1:8081/models/d161good, I get:
[
  {
    "modelName": "d161good",
    "modelVersion": "1.0",
    "modelUrl": "d161good.mar",
    "runtime": "python",
    "minWorkers": 12,
    "maxWorkers": 12,
    "batchSize": 1,
    "maxBatchDelay": 100,
    "loadedAtStartup": true,
    ...
Note the "batchSize" and "maxBatchDelay" entries: they show 1 and 100 rather than the configured 4 and 200.
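One way the batch settings do take effect is to pass them at model registration time through the management API's POST /models query parameters, rather than as top-level config.properties keys. A minimal sketch that only builds the request URL (the host, port, and parameter names below are TorchServe's documented defaults; the model name is taken from this issue):

```python
from urllib.parse import urlencode

# TorchServe's management API listens on port 8081 by default.
MANAGEMENT = "http://127.0.0.1:8081"

def register_url(mar_file, batch_size, max_batch_delay):
    """Build the POST /models URL that sets batching at registration time."""
    params = urlencode({
        "url": mar_file,
        "batch_size": batch_size,
        "max_batch_delay": max_batch_delay,
        "initial_workers": 1,
    })
    return f"{MANAGEMENT}/models?{params}"

print(register_url("d161good.mar", 4, 200))
# Send the resulting URL with e.g.:  curl -X POST "<url>"
```

After registering this way, the same GET /models/d161good query should report the requested batchSize and maxBatchDelay.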
Issue Analytics
- State:
- Created 4 years ago
- Reactions: 1
- Comments: 12 (7 by maintainers)
Top Results From Across the Web
3. Batch Inference with TorchServe - PyTorch
TorchServe model configuration: Configure batch_size and max_batch_delay by using the "POST /models" management API or settings in config.properties.

Deploying EfficientNet Model using TorchServe
The config.set_scriptable(True) line is essential. Without it, the model cannot be compiled with TorchScript. The Custom Handler.

python - loading model failed in torchserving - Stack Overflow
i am using the model from kaggle. I presume you got the model from https://www.kaggle.com/pytorch/vgg16. I think you are loading the model ...

An efficient and flexible inference system for serving ... - arXiv
Tensorflow Serving [10] and TorchServe [11]) serve the ... It is well known that the batch size is an important setting. ... fixed...

Export PyTorch Model to TorchScript | Deploy TorchServe
TorchServe is an easy-to-use, flexible and performant tool for serving and scaling ... + setup runtime properties and manifest properties ...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
@harshbafna If batchSize and maxBatchDelay can only be configured through the management API, what is the TorchServe team's recommendation for configuring them when running multiple replicas in Kubernetes, so that these values are loaded on container start/restart?
I found an example of that in the TorchServe GitHub repo: https://github.com/pytorch/serve/blob/master/kubernetes/EKS/config.properties
I hope the above link is helpful.
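For reference, the linked EKS example relies on the per-model `models` JSON property in config.properties, which lets batching be set at startup instead of through the management API. A minimal sketch of that approach (the model name and values are taken from this issue; the exact keys should be checked against the linked file and the TorchServe configuration docs):

```properties
load_models=all
models={\
  "d161good": {\
    "1.0": {\
        "defaultVersion": true,\
        "marName": "d161good.mar",\
        "minWorkers": 1,\
        "maxWorkers": 4,\
        "batchSize": 4,\
        "maxBatchDelay": 200,\
        "responseTimeout": 120\
    }\
  }\
}
```

Because this lives in config.properties, every Kubernetes replica picks up the same batch settings on container start without any post-start API call.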