If you OOM once you OOM forever


Assume that, while using inductor, you have batch sizes b_1 and b_2 where b_1 < b_2. If you run b_1 first, the model doesn’t OOM; if you then run b_2, it OOMs.

The problem is that if you run b_2 first and it OOMs, then running b_1 afterwards also OOMs, even though it’s not supposed to.

Until the OOM issues are all fixed, calling dynamo.reset() after an OOM might help: https://github.com/pytorch/pytorch/blob/master/torch/_dynamo/__init__.py#L31
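The suggested workaround can be sketched as a small retry loop. This is a minimal sketch, not code from the issue: `run_batches_with_reset`, `run_step`, and `reset_hooks` are hypothetical names, and the helper assumes that an OOM surfaces as a RuntimeError whose message contains "out of memory". The hooks run after the `except` block has finished, so the exception's traceback no longer keeps references to the failed step's objects.

```python
from typing import Callable, Iterable, List, Tuple


def run_batches_with_reset(
    run_step: Callable[[int], None],
    batch_sizes: Iterable[int],
    reset_hooks: Iterable[Callable[[], None]] = (),
) -> List[Tuple[int, bool]]:
    """Run run_step for each batch size and call every reset hook after an OOM.

    Hypothetical helper for illustration; it is not part of PyTorch.
    Returns a list of (batch_size, succeeded) pairs.
    """
    hooks = list(reset_hooks)
    results: List[Tuple[int, bool]] = []
    for batch_size in batch_sizes:
        hit_oom = False
        try:
            run_step(batch_size)
        except RuntimeError as err:
            # Assumption: OOM is reported as a RuntimeError mentioning
            # "out of memory"; anything else is re-raised unchanged.
            if "out of memory" not in str(err).lower():
                raise
            hit_oom = True
        results.append((batch_size, not hit_oom))
        if hit_oom:
            # Run the hooks outside the except block so the exception
            # (and the frames it references) has already been released.
            for hook in hooks:
                hook()
    return results
```

With PyTorch installed, one might pass `torch._dynamo.reset`, `gc.collect`, and `torch.cuda.empty_cache` as `reset_hooks`. Whether that actually releases the retained memory is not confirmed anywhere in this thread.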

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 8 (8 by maintainers)

Top GitHub Comments

2 reactions
yanboliang commented, Oct 28, 2022

I can reproduce this locally. If I print out torch.cuda.memory_allocated() (in bytes) during each iteration, this is what I see with dynamo:

batch size =  32
239657472
batch size =  16
467977216
batch size =  8
694494720
batch size =  4
923125760
batch size =  3
1150851584
batch size =  2
1378315264
batch size =  1
1604861440

However, this is native PyTorch:

batch size =  32
239657472
batch size =  16
233169408
batch size =  8
239559168
batch size =  4
232202752
batch size =  3
240456192
batch size =  2
232194560
batch size =  1
241365504

It seems dynamo doesn’t free some memory after each iteration, so the memory keeps growing.
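To make the growth concrete, the readings above can be turned into per-iteration differences. The sketch below only does arithmetic on the numbers copied from this comment; the variable and function names are illustrative and not from the issue.

```python
# Readings of torch.cuda.memory_allocated() in bytes, copied from the comment
# above, in the order batch size = 32, 16, 8, 4, 3, 2, 1.
dynamo_bytes = [
    239657472, 467977216, 694494720, 923125760,
    1150851584, 1378315264, 1604861440,
]
native_bytes = [
    239657472, 233169408, 239559168, 232202752,
    240456192, 232194560, 241365504,
]

MIB = 1024 * 1024


def deltas(readings):
    """Return the difference between each reading and the previous one."""
    return [later - earlier for earlier, later in zip(readings, readings[1:])]


dynamo_deltas = deltas(dynamo_bytes)
total_growth = dynamo_bytes[-1] - dynamo_bytes[0]
native_spread = max(native_bytes) - min(native_bytes)

for step, delta in enumerate(dynamo_deltas, start=1):
    print(f"dynamo iteration {step}: +{delta / MIB:.1f} MiB")
print(f"dynamo total growth: {total_growth / MIB:.1f} MiB")
print(f"native PyTorch spread (max - min): {native_spread / MIB:.1f} MiB")
```

With dynamo, every iteration adds between roughly 216 and 218 MiB, for about 1,302 MiB in total, and the increment stays nearly constant even as the batch size shrinks. The native PyTorch readings stay within about 9 MiB of each other.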

0 reactions
yanboliang commented, Nov 15, 2022

SG, let me check if I can reproduce with this repro.
