Right way to specify GPU memory in DDPPO.
Hi,
What parameters do we need to change in order to utilize all GPU resources?
I’m reproducing the DDPPO results using the single_node.sh script. I have a GPU with 12GB of VRAM, as shown below:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.116 Driver Version: 390.116 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX TIT... Off | 00000000:05:00.0 Off | N/A |
| 22% 59C P8 30W / 250W | 4501MiB / 12210MiB | 11% Default |
+-------------------------------+----------------------+----------------------+
Only about 4.5GB of the VRAM is utilized, with 4 worker processes running:
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 17009 G ...a/miniconda3/envs/habitatapi/bin/python 1382MiB |
| 0 17010 G ...a/miniconda3/envs/habitatapi/bin/python 1375MiB |
| 0 17011 G ...a/miniconda3/envs/habitatapi/bin/python 1377MiB |
| 0 17012 G ...a/miniconda3/envs/habitatapi/bin/python 355MiB |
+-----------------------------------------------------------------------------+
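One way to size this up is to query the free VRAM programmatically and divide by the per-worker footprint from the table above (roughly 1400MiB per worker). Below is a minimal sketch using the pynvml package, which wraps the same NVML library that nvidia-smi reads from; pynvml is an assumption here, not part of the original setup.

import pynvml

# Query total/used/free memory (in bytes) on GPU 0 via NVML.
pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
info = pynvml.nvmlDeviceGetMemoryInfo(handle)

per_worker = 1400 * 1024 ** 2  # ~1400MiB per worker, read off the table above
print(f"free: {info.free / 1024 ** 2:.0f}MiB, "
      f"room for roughly {info.free // per_worker} more workers")

pynvml.nvmlShutdown()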
How can I utilize the remaining 8GB to speed up training? In ddppo_pointnav.yaml, NUM_PROCESSES = 4. Should I increase this parameter, or are there other configurations I need to change as well?
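For reference, a hedged sketch of overriding that value from Python, assuming the yacs-based config API of older habitat-baselines releases (get_config plus defrost/freeze); the config path and field name may differ in your version.

from habitat_baselines.config.default import get_config

# Load the DD-PPO pointnav config and raise the worker count.
config = get_config("habitat_baselines/config/pointnav/ddppo_pointnav.yaml")
config.defrost()
config.NUM_PROCESSES = 8  # e.g. 8 workers, if ~1400MiB each fits in free VRAM
config.freeze()

The same override can usually also be appended as trailing KEY VALUE pairs on the run.py command line (e.g. NUM_PROCESSES 8), though the exact invocation depends on the habitat-baselines version.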
Issue Analytics
- State:
- Created 4 years ago
- Comments: 7 (2 by maintainers)
Top GitHub Comments
This all seems correct to me.
The frame rates reported by the training script include the simulation time, the model inference time, and the parameter update time, so they will be considerably lower than the frame rate of the simulator alone.
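As a back-of-the-envelope illustration of that breakdown (the timings below are invented, not measured):

# Hypothetical per-frame timings in seconds, purely for illustration.
t_sim, t_infer, t_update = 0.010, 0.005, 0.005

raw_sim_fps = 1.0 / t_sim                          # 100 FPS: simulator alone
training_fps = 1.0 / (t_sim + t_infer + t_update)  # 50 FPS: full training loop
print(f"simulator-only: {raw_sim_fps:.0f} FPS, training: {training_fps:.0f} FPS")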
Yes, you can increase the number of processes. That said, I am surprised that no GPU memory is being used by CUDA (it would show as type C in nvidia-smi); is the model on the CPU? If so, I would highly recommend using the remaining GPU memory for the model and the forward/backward passes.
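A quick way to check is to inspect the device of the policy's parameters and move the model if needed. A minimal PyTorch sketch, where the linear layer is just a placeholder standing in for the actor-critic policy:

import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # placeholder for the actual actor-critic policy

# Parameters report which device the model lives on.
print(next(model.parameters()).device)  # "cpu" if the model was never moved

if torch.cuda.is_available():
    model = model.to("cuda:0")  # run forward/backward passes on the GPU
    print(next(model.parameters()).device)  # now "cuda:0"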