Production - Memory Leak?
Is anyone serving a classifier in production? I'm using Flask, and on every call to predict the process's memory usage grows until the pod hits an OOM error on Kubernetes. I only instantiate this class once and reuse the same instance. This happens even with batch size 1:
import numpy as np
from simpletransformers.classification import MultiLabelClassificationModel
from sklearn.preprocessing import MultiLabelBinarizer

class Transformer:
    def __init__(self, model_dir='model/'):
        # Single shared model instance; feature caching disabled on every call
        self.model = MultiLabelClassificationModel(
            'roberta', model_dir, use_cuda=False,
            args={'no_cache': True, 'use_cached_eval_features': False},
        )
        self.mlb = MultiLabelBinarizer()
        self.mlb.fit([['A', 'B', 'C', 'D']])

    def predict(self, batch):
        # predict() returns binary indicator rows plus the raw model outputs
        predictions, raw_outputs = self.model.predict(batch)
        # Map indicator rows back to the label names fitted above
        labels = self.mlb.inverse_transform(np.asarray(predictions))
        return labels
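
For context on how a class like this gets served, here is a minimal Flask sketch sharing one Transformer instance across requests (the endpoint path and JSON shape are illustrative assumptions, not from the original issue):

from flask import Flask, jsonify, request

app = Flask(__name__)
# Build the model once at import time so every request reuses it,
# matching "I only instantiate this class once" from the report above.
transformer = Transformer()

@app.route('/predict', methods=['POST'])
def predict():
    # Expects a JSON body like {"texts": ["some document", ...]}
    batch = request.get_json()['texts']
    labels = transformer.predict(batch)
    # inverse_transform yields tuples; convert to lists for JSON
    return jsonify([list(row) for row in labels])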
Issue Analytics
- State:
- Created 3 years ago
- Comments: 8 (4 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
I have more info. The problem is with multiprocessing. When I run the Flask dev server it works perfectly, but when I run under gunicorn I have to disable use_multiprocessing for it to behave.

Change this (works with the Flask server but not with gunicorn):

model = ClassificationModel('bert', 'outputs', num_labels=11, use_cuda=False, args={'silent': True})

To this (works for both):

model = ClassificationModel('bert', 'outputs', num_labels=11, use_cuda=False, args={'silent': True, 'use_multiprocessing': False})
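
Formatted as a runnable sketch (the simpletransformers import is assumed; the flag is exactly the one named in the comment above):

from simpletransformers.classification import ClassificationModel

# Disabling the multiprocessing pool that simpletransformers uses for
# feature conversion avoids the memory growth seen under gunicorn.
model = ClassificationModel(
    'bert', 'outputs', num_labels=11, use_cuda=False,
    args={'silent': True, 'use_multiprocessing': False},
)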
I am running Flask without gunicorn, and it runs fine for me. I know it warns that the dev server isn't production grade, but I'm sure I've processed 10k+ requests at this stage, live for at least a month.

And an update on my memory issue from the OP: with batch size 1, my instances converged to around 2.9 GB; with batch size 300 they also converged, to around 5-6 GB, so it was OK I guess. My real problem was that my initial 2 GB instances were simply too small.
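
If you want to verify that kind of convergence yourself, one approach is a small helper that logs the process's resident memory around each call. A minimal sketch, assuming psutil is installed (the helper name is an illustration, not from the thread):

import os
import psutil

process = psutil.Process(os.getpid())

def log_rss(tag=''):
    # Resident set size of the current process, in MB
    rss_mb = process.memory_info().rss / 1024 ** 2
    print(f'{tag} RSS: {rss_mb:.1f} MB')

# Wrap each prediction to watch for growth vs. convergence:
# log_rss('before'); transformer.predict(batch); log_rss('after')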