The speed of prediction gets much slower after deployment on the server
My model is converted from Keras. The speed is normal, less than 1 s, when I test it on localhost. But after I deployed it on my remote Ubuntu server with Tomcat (HTTPS), prediction became much slower, taking nearly 10 s.
const tensor = tf.tensor(data, [1, 93], "int32");
let value = sentiment_predict(tensor); // async function called without await, so value is a Promise

async function sentiment_predict(tensor){
  const model = await tf.loadLayersModel(MODEL_URL); // downloads and parses the model on every call
  let result = model.predict(tensor).toString(); // stringified tensor, e.g. "Tensor [[neg, pos],]"
  let num = result.split('[')[2].split(',');
  let neg = parseFloat(num[0]);
  let pos = parseFloat(num[1]);
  return pos - neg;
}
Maybe it is because of the model size (6.3 MB)? Or because I load the model every time I want to predict? Why is that, and what should I do? I'd appreciate it if you could help me.
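One way to check the per-call loading hypothesis is to time the model load and the inference separately. Here is a minimal diagnostic sketch, assuming the same MODEL_URL and [1, 93] int32 input as the snippet above (timedPredict is a hypothetical helper, not part of the original code):

async function timedPredict(data) {
  const t0 = performance.now();
  const model = await tf.loadLayersModel(MODEL_URL); // fetches model.json plus weight shards
  const t1 = performance.now();
  const tensor = tf.tensor(data, [1, 93], "int32");
  const output = model.predict(tensor);
  await output.data(); // forces the computation to finish before taking the second timestamp
  const t2 = performance.now();
  console.log(`load: ${(t1 - t0).toFixed(0)} ms, predict: ${(t2 - t1).toFixed(0)} ms`);
}

If the load step dominates, then downloading the 6.3 MB model over HTTPS on every call would explain the gap between localhost and the remote server.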
Issue Analytics
- Created: 4 years ago
- Comments: 6
Top Results From Across the Web

Minimizing real-time prediction serving latency in machine ...
An ML model is useful only if it's deployed and ready to make predictions, but building an adapted ML serving system requires the...

Why does keras model predict slower after compile?
Yes, both are possible, and it will depend on (1) data size; (2) model size; (3) hardware. Code at the bottom actually shows...

Optimizing Models for Deployment and Inference - neptune.ai
Batches of images of products could be taken simultaneously to the prediction, and thus sequentially inferencing each image would actually slow ...

There are two very different ways to deploy ML models, here's ...
It can make calls to a backend server to get results, which it then maybe ... and storing models or predictions to the...

Performance Guide | TFX - TensorFlow
This is due to a better potential for multi-tenant deployment to utilize the hardware and lower fixed costs (RPC server, TensorFlow runtime, etc.)...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Sounds like you might be trying to make a prediction before the model has finished loading.
Yeah, you are right. I tried that, and the problem has basically been solved. Thanks.
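For reference, a minimal sketch of the fix the comments point to, assuming the same MODEL_URL and a [1, 2] output of [neg, pos] scores as in the original snippet: load the model once, reuse it across calls, and await the async prediction before using its result. Reading the scores with output.data() replaces the string parsing.

// Load the model a single time; every call reuses the same promise.
const modelPromise = tf.loadLayersModel(MODEL_URL);

async function sentiment_predict(tensor) {
  const model = await modelPromise;        // resolves immediately after the first load
  const output = model.predict(tensor);    // tf.Tensor of shape [1, 2]
  const [neg, pos] = await output.data();  // read scores numerically instead of parsing toString()
  output.dispose();                        // release the tensor's memory
  return pos - neg;
}

// Usage (inside an async context): await the call, or value will be a Promise, not a number.
const tensor = tf.tensor(data, [1, 93], "int32");
const value = await sentiment_predict(tensor);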