How to cache inferences with TorchServe
Reference architecture showcasing how to cache inferences from TorchServe.
The inference handler could potentially read from a cloud cache or KV store before running the model. The benefit is that this would dramatically reduce latency for common queries.
This is probably a good level 3-4 bootcamp task for a specific KV store like Redis, or a specific managed cache on AWS.
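A minimal sketch of what such a handler-level cache could look like, keyed on a hash of the request body. A plain dict stands in for the KV store here so the sketch runs without a live Redis server; in production a `redis.Redis` client (which exposes the same `get`/`set` interface) would be swapped in, and the lookup logic would live inside a `ts.torch_handler.base_handler.BaseHandler` subclass. All class and function names below are illustrative, not part of any TorchServe API.

```python
import hashlib
import json

class DictCache:
    """In-memory stand-in for a Redis-style KV store (get/set by key)."""
    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value

def cache_key(payload: bytes) -> str:
    # Hash the raw request body so identical queries map to one key.
    return "ts:infer:" + hashlib.sha256(payload).hexdigest()

class CachingHandler:
    """Wraps a model's inference callable with a KV-store lookup.

    In a real TorchServe custom handler this logic would sit inside
    the handler's handle()/inference() methods; here model_fn stands
    in for the actual forward pass.
    """
    def __init__(self, model_fn, cache):
        self.model_fn = model_fn
        self.cache = cache
        self.hits = 0
        self.misses = 0

    def handle(self, payload: bytes):
        key = cache_key(payload)
        cached = self.cache.get(key)
        if cached is not None:
            self.hits += 1                 # served straight from the cache
            return json.loads(cached)
        self.misses += 1
        result = self.model_fn(payload)    # run the (expensive) model
        self.cache.set(key, json.dumps(result))
        return result

# Usage: a dummy "model" that records how often it actually runs.
calls = []
def dummy_model(payload: bytes):
    calls.append(payload)
    return {"label": "cat", "score": 0.98}

handler = CachingHandler(dummy_model, DictCache())
handler.handle(b'{"image": "abc"}')   # miss: runs the model
handler.handle(b'{"image": "abc"}')   # hit: model is skipped
```

With a real Redis backend you would also want a TTL on `set` so stale predictions expire after a model update.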
Issue Analytics
- State: Closed
- Created: 2 years ago
- Comments: 9 (4 by maintainers)
Top Results From Across the Web
12. Running TorchServe — PyTorch/Serve master documentation
TorchServe can be used for many types of inference in production settings. It provides an easy-to-use command line interface and utilizes REST-based...

Serving PyTorch models with TorchServe | by Álvaro Bartolomé
TorchServe is the ML model serving framework developed by PyTorch. This post explains how to train and serve a CNN transfer learning model...

BERT TorchServe Tutorial — AWS Neuron Documentation
This tutorial demonstrates the use of TorchServe with Neuron. Download the custom handler script that will eventually respond to inference requests.

PyTorch - KServe Documentation Website
In this example, we use a trained PyTorch MNIST model to predict handwritten digits by running an inference service with the TorchServe predictor.

Deploying EfficientNet Model using TorchServe
For more information on batch inference with TorchServe, please refer to ... cache/b4.pt --handler handler.py \ --export-path model-store ...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Closing for now, since this could be handled outside of TorchServe more easily.
That is correct. If you have a reference example, we can add it to the repo.
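The "outside of TorchServe" approach the maintainers suggest could be sketched as a thin caching proxy sitting in front of the `/predictions/<model>` REST endpoint (port 8080 is the TorchServe default). Here `post_fn` stands in for the actual HTTP POST to TorchServe, so the sketch runs without a live server; the class and parameter names are illustrative assumptions, not an established tool.

```python
import hashlib

class CachingProxy:
    """Caches TorchServe responses in front of the REST endpoint.

    post_fn: callable taking the raw request body (bytes) and returning
    the response body (bytes) - in production, an HTTP POST to e.g.
    http://localhost:8080/predictions/<model>.
    """
    def __init__(self, post_fn, cache=None):
        self.post_fn = post_fn
        self.cache = cache if cache is not None else {}

    def predict(self, body: bytes) -> bytes:
        key = hashlib.sha256(body).hexdigest()
        if key in self.cache:            # cache hit: TorchServe never sees it
            return self.cache[key]
        response = self.post_fn(body)    # cache miss: forward to TorchServe
        self.cache[key] = response
        return response

# Usage with a stub backend that records each forwarded request.
forwarded = []
def fake_torchserve(body: bytes) -> bytes:
    forwarded.append(body)
    return b'{"label": "dog"}'

proxy = CachingProxy(fake_torchserve)
proxy.predict(b"img-bytes")   # forwarded to the backend
proxy.predict(b"img-bytes")   # answered from the cache
```

Keeping the cache outside the handler means no model archive changes are needed, which is presumably why the maintainers considered it the easier route.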