OWL-ViT memory usage grows linearly with each prediction
System Info
- transformers version: 4.21.1
- Platform: Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.29
- Python version: 3.8.11
- Huggingface_hub version: 0.8.1
- PyTorch version (GPU?): 1.12.1+cu102 (False)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, …)
- My own task or dataset (give details below)
Reproduction
import torch
from torchvision.datasets import FakeData
from torchvision.transforms.functional import pil_to_tensor
from transformers import OwlViTProcessor, OwlViTForObjectDetection
text_prompts = ["a photo of a cat", "a photo of a dog"]
dataset = FakeData(size=50, image_size=(3, 28, 28), transform=pil_to_tensor)
processor = OwlViTProcessor.from_pretrained("google/owlvit-base-patch32")
model = OwlViTForObjectDetection.from_pretrained("google/owlvit-base-patch32")
target_sizes = torch.Tensor([[28, 28]])
for image, _ in dataset:
    inputs = processor(text=[text_prompts], images=image, return_tensors="pt")
    outputs = model(**inputs)
    _ = processor.post_process(outputs=outputs, target_sizes=target_sizes)[0]
Expected behavior
I expect to be able to generate predictions from the OwlViTForObjectDetection model in a loop without memory usage increasing by ~1GB on each call to the model (line 15 of the original script, the outputs = model(**inputs) call). Below, I’ve included a plot of memory usage over time. I profiled the code using memory_profiler to determine that it is the call to the model (not the processing or post-processing) that seems to be the culprit.
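The per-iteration growth described above can also be checked without extra dependencies. The following is a minimal sketch that uses the standard-library resource module instead of memory_profiler; it is not the profiling setup used in this report. The helper names (peak_rss_mib, track_peak_rss) and the leaky_step stand-in are hypothetical and exist only for illustration; in the reproduction, the step would be one pass through the loop body.

```python
import resource


def peak_rss_mib():
    """Peak resident set size of this process in MiB.

    Assumes Linux, where ru_maxrss is reported in KiB
    (macOS reports bytes instead).
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024


def track_peak_rss(step, n_iters):
    """Call step() n_iters times and record the peak RSS after each call."""
    samples = []
    for _ in range(n_iters):
        step()
        samples.append(peak_rss_mib())
    return samples


# Hypothetical leaky step: keeps a reference to ~8 MiB on every call.
retained = []


def leaky_step():
    retained.append(b"x" * (8 * 1024 * 1024))


samples = track_peak_rss(leaky_step, n_iters=5)
for i, mib in enumerate(samples):
    print(f"iteration {i}: peak RSS {mib:.1f} MiB")
```

Because peak RSS never decreases, a value that keeps climbing by a similar amount on every iteration suggests that something is being retained between calls, whereas a flat curve after the first few iterations suggests it is not.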
Issue Analytics
- State:
- Created a year ago
- Comments: 9 (6 by maintainers)
Top GitHub Comments
@ydshieh so true! Took me a deep dive into PyTorch docs and a while to debug it
We all learn the hard way, @alaradirik. Just a few weeks ago, @NielsRogge and I had the same issue 😆 Whenever there is a PyTorch memory issue -> check with torch.no_grad() first.
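The advice in the comments can be illustrated with a small sketch. To keep it runnable without downloading weights, a torch.nn.Linear layer stands in for OwlViTForObjectDetection; that substitution and the predict helper are assumptions made for illustration, not code from the thread. The sketch shows only that a forward pass inside torch.no_grad() returns tensors with no autograd graph attached.

```python
import torch

# Stand-in for OwlViTForObjectDetection (assumption: used here only so the
# example runs without downloading pretrained weights).
model = torch.nn.Linear(4, 2)
model.eval()


def predict(model, batch):
    """Run a forward pass without recording the autograd graph."""
    with torch.no_grad():
        return model(batch)


batch = torch.randn(8, 4)

tracked = model(batch)             # default mode: the graph is recorded
untracked = predict(model, batch)  # inside no_grad: no graph is recorded

# The tracked output carries a grad_fn that references the graph.
print(tracked.requires_grad, tracked.grad_fn is not None)
# The untracked output holds the same values with no graph attached.
print(untracked.requires_grad, untracked.grad_fn is None)
```

In the reproduction script, the corresponding change would be to place the outputs = model(**inputs) call inside a with torch.no_grad(): block, which is what the comments above suggest checking first.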