
CUDA runs out of memory

See original GitHub issue

Has anyone run into GPU out-of-memory issues when running the imagine command? Below is the error I get.

RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 6.00 GiB total capacity; 4.47 GiB already allocated; 716.80 KiB free; 4.48 GiB reserved in total by PyTorch)

I tried both gc.collect() and torch.cuda.empty_cache(), but neither worked.
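For reference, the cleanup the poster describes usually looks like the sketch below. Note that empty_cache() only returns *unused* cached blocks to the driver; tensors that are still referenced (model weights, activations) stay allocated, which is why this often doesn't resolve a genuine OOM.

```python
import gc

import torch

def free_cuda_memory():
    # Drop unreachable Python objects first so their tensors become
    # unreferenced, then release PyTorch's cached blocks to the driver.
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()

free_cuda_memory()
```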

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 6

Top GitHub Comments

5 reactions
afiaka87 commented on Jan 23, 2021

> Has anyone run into GPU out-of-memory issues when running the imagine command? Below is the error I get.
>
> RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; **6.00 GiB** total capacity; 4.47 GiB already allocated; 716.80 KiB free; 4.48 GiB reserved in total by PyTorch)
>
> I tried both gc.collect() and torch.cuda.empty_cache(), but neither worked.

Edit: Just realized you’re working with 6 GiB of VRAM. Have you considered using the Google Colab notebook instead? If it’s your first time, they tend to give you a decent GPU with 16 GiB of memory. Not to mention it’s free (unless you’re using it a lot).

You can check your GPU’s memory usage with NVIDIA’s CLI tool nvidia-smi, which ships with the CUDA toolkit.
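If you’d rather check from inside Python, PyTorch exposes the same numbers. A minimal sketch (mem_get_info needs a CUDA build of PyTorch and a visible device, so it’s guarded here):

```python
import torch

if torch.cuda.is_available():
    gib = 2 ** 30
    free_b, total_b = torch.cuda.mem_get_info()   # driver-level free/total
    allocated_b = torch.cuda.memory_allocated()   # bytes held by live tensors
    reserved_b = torch.cuda.memory_reserved()     # bytes held by PyTorch's cache
    print(f"free {free_b / gib:.2f} / total {total_b / gib:.2f} GiB; "
          f"allocated {allocated_b / gib:.2f} GiB, reserved {reserved_b / gib:.2f} GiB")
else:
    print("No CUDA device visible")
```

The gap between "reserved" and "allocated" is PyTorch's caching allocator, which is exactly what the "reserved in total by PyTorch" part of the error message refers to.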

This unfortunately comes with the territory. The code runs best on a graphics card with 16 GiB of VRAM. If you’ve got less than that, here are some parameters you can change to lower your VRAM usage.

Decrease --image_width (default is 512).

This one lowers memory usage a lot. I’ve even done this on a 16 GiB Colab instance just so I could run 64 hidden layers on a 256px image. Just keep in mind that 512 is a decent default. You’ll probably want to decrease in multiples of 8, though it may not matter much. At any rate, the obvious tradeoff here is that you’ll get a less detailed output.

Lower --batch_size (default is 4).

Increase --gradient_accumulate_every (default is 4).

This spreads each update over several smaller batches: the loss and backpropagation (the memory-intensive bit) run per sub-batch, and each image’s loss is divided by the accumulation count (the default, 4) so the summed gradient stays at the scale of a single averaged step, rather than punishing the network harder just because more images were accumulated.
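The accumulation scheme described above can be sketched in plain PyTorch. This is a toy model and loop, not the project’s actual training code; the point is that dividing each sub-batch’s loss by the accumulation count makes the accumulated gradient equal to one full-batch gradient:

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(8, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
data = torch.randn(16, 8)
target = torch.randn(16, 1)

accumulate_every = 4  # mirrors --gradient_accumulate_every
micro_batches = data.chunk(accumulate_every)
micro_targets = target.chunk(accumulate_every)

opt.zero_grad()
for x, y in zip(micro_batches, micro_targets):
    loss = torch.nn.functional.mse_loss(model(x), y)
    # Scale each sub-batch loss so the summed gradient matches
    # a single averaged full-batch step.
    (loss / accumulate_every).backward()
opt.step()  # one optimizer update for all four sub-batches
```

Only one sub-batch’s activations live in memory at a time, which is where the VRAM saving comes from.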

Decrease --num_layers (default is 32).

This one is basically a requirement on a GPU with less than 16 GiB of memory. The default of 32 is meant for Colab users and is honestly a bit high, considering consumer GPUs tend not to have more than 8 GiB of VRAM. Lowering it to 16 will get you below 8 GiB of VRAM, but the results will be more abstract and silly. If you do decrease this value, lower it only as much as you need and no more, because more hidden layers seem to help quite a bit.
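To see why --num_layers dominates memory use: each hidden layer adds a fixed block of weights (plus its activations at training time), so cost grows linearly with the layer count. A toy SIREN-style MLP with hypothetical sizes (not necessarily this project’s exact architecture) makes the scaling concrete:

```python
import torch

def make_mlp(num_layers, hidden=256):
    # num_layers hidden blocks of fixed width, mapping 2D coordinates to RGB
    layers = [torch.nn.Linear(2, hidden)]
    for _ in range(num_layers):
        layers.append(torch.nn.Linear(hidden, hidden))
    layers.append(torch.nn.Linear(hidden, 3))
    return torch.nn.Sequential(*layers)

def param_count(model):
    return sum(p.numel() for p in model.parameters())

for n in (16, 32):
    print(f"num_layers={n}: {param_count(make_mlp(n)):,} parameters")
```

Halving num_layers from 32 to 16 roughly halves the hidden-layer parameters (and the matching activations and gradients), which is why it is the biggest lever on a small card.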

0 reactions
alien-einstein commented on Jun 28, 2021

Hello, I keep getting this strange error even though there is still space left in my GPU’s RAM. I tried using --numlayers 16, but the error stays the same, and I also used --batch_size 2; neither helped.

RuntimeError: CUDA out of memory. Tried to allocate 128.00 MiB (GPU 0; 4.00 GiB total capacity; 2.47 GiB already allocated; 0 bytes free; 2.49 GiB reserved in total by PyTorch)

Read more comments on GitHub >

