Very slow inference in 0.5.11
See original GitHub issue

After training a default classifier, then saving and loading it, model.predict("lorem ipsum") and model.predict_proba take on average 14 seconds per call, even on a hefty server such as an AWS p3.16xlarge.
Issue Analytics
- Created: 5 years ago
- Comments: 17 (17 by maintainers)

Thanks, for my use case (serving a model as an API), a context manager doesn’t fit, since I need to call predict after an external event (e.g. an HTTP request), so I’m just calling _cached_inference directly. Anyhow, I think we can finally close this issue. Thanks a lot for your great work!

Hi @dimidd,
Thanks for checking back in! Although I was hoping to end up with a solution where we could have our metaphorical cake and eat it too, we ran into some limitations with how TensorFlow cleans up memory, which meant we had to opt for a more explicit interface for prediction if you want to avoid rebuilding the graph: https://finetune.indico.io/#prediction
Let me know if this solution works for you!
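For reference, the two usage patterns discussed above look roughly like the sketch below. The cached_predict context-manager name reflects my reading of the linked prediction docs and may differ between versions; the Flask app, route, and ExitStack workaround are illustrative and not taken from the thread (the commenter instead calls the private _cached_inference method directly).

```python
from contextlib import ExitStack

from flask import Flask, jsonify, request
from finetune import Classifier

model = Classifier.load("classifier.model")  # hypothetical path

# Batch usage: keep the underlying graph alive for the whole block so that
# repeated predict calls don't rebuild it each time.
with model.cached_predict():
    for text in ["lorem ipsum", "dolor sit amet"]:
        model.predict([text])

# Server usage: predictions are triggered by external events (HTTP requests),
# so a with-block doesn't fit naturally. One option is to enter the context
# manager once at startup and hold it open for the lifetime of the process.
app = Flask(__name__)
stack = ExitStack()
stack.enter_context(model.cached_predict())

@app.route("/predict", methods=["POST"])
def predict():
    text = request.get_json()["text"]
    return jsonify({"label": str(model.predict([text])[0])})
```

Holding the context open trades memory for latency; whether that or calling _cached_inference directly is preferable will depend on the finetune version in use.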