
Memory estimation inconsistent with actual GPU memory utilization

See original GitHub issue

Describe the bug

Memory estimation is inconsistent with actual GPU memory utilization.

To Reproduce

  • I am using a simple UNet with 2 layers (same as here).
  • The input size is (1, 1, 4096, 3328)

Expected behavior

When forwarding an image of size (1, 1, 4096, 3328) in testing mode, i.e., with model.eval() on, the reported GPU memory is approximately 15 GB:

[Screenshot (2022-07-01): observed GPU memory usage of roughly 15 GB]
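For context, here is a minimal sketch of how that figure could be measured. TinyUNet is a hypothetical stand-in for the 2-layer UNet referenced above (the real architecture is behind the "here" link and may differ), so the absolute numbers will not match the reporter's; the issue also does not say whether gradients were disabled during the measurement.

import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    # Hypothetical stand-in for the reporter's 2-layer UNet.
    def __init__(self):
        super().__init__()
        self.down = nn.Sequential(nn.Conv2d(1, 64, 3, padding=1), nn.ReLU())
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 64, 2, stride=2)
        self.out = nn.Conv2d(64, 1, 1)

    def forward(self, x):
        x = self.down(x)
        x = self.pool(x)
        x = self.up(x)
        return self.out(x)

device = torch.device("cuda")
model = TinyUNet().to(device).eval()
x = torch.randn(1, 1, 4096, 3328, device=device)

torch.cuda.reset_peak_memory_stats(device)
with torch.no_grad():          # assumed; the issue only mentions model.eval()
    _ = model(x)
torch.cuda.synchronize(device)

print(f"peak allocated: {torch.cuda.max_memory_allocated(device) / 1024**3:.1f} GiB")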

However, torchinfo.summary reports 50 GB, even though eval is passed as the mode argument:

summary(model, input_size=(1, 1, 4096, 3328), mode='eval', device=device)

[Screenshot (2022-07-01): torchinfo.summary output estimating roughly 50 GB]
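Continuing the sketch above (same hypothetical model and device), the corresponding torchinfo call would look like this; the reporter saw an estimate of roughly 50 GB for their model at this step.

from torchinfo import summary

stats = summary(
    model,                            # hypothetical TinyUNet from the sketch above
    input_size=(1, 1, 4096, 3328),    # (N, C, H, W) as reported in the issue
    mode='eval',
    device=device,
)
print(stats)   # prints the per-layer table and the memory/size estimates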

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

1 reaction
mert-kurttutan commented, Oct 13, 2022

Actually, gradients are not calculated in either mode, since torch.no_grad is used for both train and eval mode; see the forward_pass function in torchinfo.py. I also checked: GPU memory usage remains the same when changing the mode.
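To illustrate that observation (this is not torchinfo's actual code, just a sketch of the behaviour described above): once the forward pass runs inside torch.no_grad, no autograd graph is kept, so the measured peak is essentially the same in train() and eval() mode.

import torch

def peak_forward_gib(model, x):
    # Peak GPU memory (GiB) for one forward pass wrapped in torch.no_grad,
    # mirroring what the comment above says torchinfo's forward_pass does.
    torch.cuda.reset_peak_memory_stats(x.device)
    with torch.no_grad():
        model(x)
    torch.cuda.synchronize(x.device)
    return torch.cuda.max_memory_allocated(x.device) / 1024**3

# With any CUDA model and input, both calls report essentially the same peak:
# print(peak_forward_gib(model.train(), x))
# print(peak_forward_gib(model.eval(), x))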

1 reaction
rodrigovimieiro commented, Sep 25, 2022

@devrimcavusoglu I don’t have enough GPU memory for the model. That’s why I was trying to estimate it.

Read more comments on GitHub >
