VRAM use grows when using SentenceTransformer.encode + potential fix.
Hey,
I have been using this repository to obtain sentence embeddings for a data set I am currently working on. When using `SentenceTransformer.encode`, I noticed that my VRAM usage grows over time until a CUDA out-of-memory error is raised. Through my own experiments I have found the following:

- Detaching the embeddings before they are extended to `all_embeddings`, using `embeddings = embeddings.to("cpu")`, greatly reduces this growth.
- Even with the above line added, VRAM use still grew, albeit slowly. After also adding `torch.cuda.empty_cache()`, VRAM usage appears to stop growing over time.

The first point makes sense as a fix, but I am unsure why the `empty_cache()` call should be necessary.
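For reference, here is a minimal sketch of the pattern I mean (simplified, not the library's actual code; `model` and `batches` stand in for the real internals of `encode`):

```python
import torch

# Accumulate embeddings on the CPU inside the loop, so the GPU only
# ever holds one batch of results at a time.
all_embeddings = []
for batch in batches:  # placeholder: an iterable of tokenized input batches
    with torch.no_grad():
        embeddings = model(batch)  # placeholder: forward pass returning GPU tensors
    embeddings = embeddings.detach().to("cpu")  # the fix from the first bullet
    torch.cuda.empty_cache()  # the extra call from the second bullet
    all_embeddings.extend(embeddings)
```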
I am using PyTorch 1.6.0, transformers 3.3.1, sentence_transformers 0.3.7.
Have I missed something in the docs, or am I doing something daft? I am happy to submit a pull request if need be.
Thanks,
Martin
Agreed. When `convert_to_numpy=True`, I will change the code so that `detach()` and `.cpu()` happen inside the loop, not afterwards.
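A rough sketch of that change (assumed shape, not the actual commit): with `convert_to_numpy=True`, each batch is detached, moved to host memory, and converted as soon as it is produced, instead of converting the accumulated GPU tensors after the loop.

```python
import torch

convert_to_numpy = True  # mirrors the encode() parameter
all_embeddings = []
for batch in batches:  # placeholder: an iterable of input batches
    with torch.no_grad():
        embeddings = model(batch)  # placeholder: forward pass on the GPU
    if convert_to_numpy:
        # Detach and copy to the CPU immediately, so no reference to the
        # GPU tensor survives past this iteration.
        embeddings = embeddings.detach().cpu().numpy()
    all_embeddings.extend(embeddings)
```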
I have the same observation; the current code would still crash with OOM. I managed to fix it by adding the following lines for each iteration at https://github.com/UKPLab/sentence-transformers/blob/master/sentence_transformers/SentenceTransformer.py#L188:

```python
del embeddings
torch.cuda.empty_cache()
```
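Placed in context, that per-iteration cleanup would look roughly like this (a sketch using the same placeholder loop as above, since the linked line sits inside `encode`'s batching loop):

```python
import torch

all_embeddings = []
for batch in batches:  # placeholder: an iterable of input batches
    with torch.no_grad():
        embeddings = model(batch)  # placeholder: forward pass on the GPU
    all_embeddings.extend(embeddings.detach().cpu())
    # Drop the last Python reference to the GPU tensor, then ask the caching
    # allocator to return the freed blocks to the driver.
    del embeddings
    torch.cuda.empty_cache()
```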