How to free up the CUDA memory
See original GitHub issue

I just wanted to build a model to see how pytorch-lightning works. I am working in a Jupyter notebook and I stopped the cell in the middle of training. I wanted to free up the CUDA memory and couldn't find a proper way to do that without restarting the kernel. Here is what I tried:
import torch
import pytorch_lightning

del model         # model is a pl.LightningModule
del trainer       # pl.Trainer
del train_loader  # torch DataLoader
torch.cuda.empty_cache()

# this is also stuck
pytorch_lightning.utilities.memory.garbage_collection_cuda()
Deleting the model and calling torch.cuda.empty_cache() works in plain PyTorch (see the sketch below).
- Version 0.9.0
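For comparison, here is a minimal sketch of the deletion-plus-empty_cache() approach the reporter says works in plain PyTorch. The variable names (model, trainer, train_loader) are the objects from the snippet above; the explicit gc.collect() call is an addition in this sketch, not something the original report includes.

import gc
import torch

# Drop every Python reference to objects that hold CUDA tensors.
del model         # pl.LightningModule
del trainer       # pl.Trainer
del train_loader  # torch DataLoader

# Collect the now-unreferenced objects, then return cached blocks
# to the CUDA driver so the memory shows up as free in nvidia-smi.
gc.collect()
torch.cuda.empty_cache()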
Issue Analytics
- Created: 3 years ago
- Reactions: 2
- Comments: 8 (5 by maintainers)
I think
is all you need.
Yep, I think that is because our subprocess does not get killed properly for these signals. I've been working on this in #2165; I'll check it also on Jupyter/Colab once the refactors are done and I can finish this PR. I am fairly confident that this is related and that #2165 fixes it, but not 100% sure.
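As a side note, one way to check from the notebook whether an interrupted run has actually released its GPU memory is to query PyTorch's allocator counters; this is standard torch.cuda introspection, not something discussed in this thread.

import torch

# Bytes currently occupied by live tensors on the current CUDA device.
print(torch.cuda.memory_allocated())

# Bytes held by PyTorch's caching allocator, including cached but unused
# blocks; torch.cuda.empty_cache() returns these blocks to the driver.
print(torch.cuda.memory_reserved())

If memory_allocated() stays high after the del calls, some Python object (for example a stored output or the traceback of the interrupt) is still holding a reference to a CUDA tensor, and empty_cache() cannot reclaim it.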