DataParallel is used by auto_model with single GPU
🐛 Bug description
I am not sure whether this is a bug or a feature: DataParallel is being applied/patched by idist.auto_model in a single-GPU context (backend=None, nproc_per_node=1). What is the reason behind this choice? Does it bring any speed improvements?
The only way I found to prevent it is to set os.environ["CUDA_VISIBLE_DEVICES"] = "0" for single-GPU contexts.
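For reference, a minimal sketch of the behaviour and the workaround (assumed setup: a machine with several visible GPUs and no distributed backend initialized; not code from the issue itself):

```python
import os
import torch.nn as nn
import ignite.distributed as idist

# Workaround: uncomment to expose only one device so auto_model skips DataParallel.
# os.environ["CUDA_VISIBLE_DEVICES"] = "0"

model = idist.auto_model(nn.Linear(10, 2))
# With several visible GPUs and no distributed group, this prints
# <class 'torch.nn.parallel.data_parallel.DataParallel'>; otherwise the plain module.
print(type(model))
```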
Environment
- PyTorch Version (e.g., 1.4): 1.7.1
- Ignite Version (e.g., 0.3.0): 0.4.8
- OS (e.g., Linux): Linux
- How you installed Ignite (conda, pip, source): pip
- Python version: 3.8
Yes, we are using world_size to set up DDP. If world_size is defined and > 1, then there is a distributed processing group and there is no point in using DP.
Here is the code: https://github.com/pytorch/ignite/blob/6d83dd72bb0bb7e655cd284789f367b46ab36a9e/ignite/distributed/auto.py#L201-L230
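Paraphrasing the linked logic, a simplified sketch of the decision (not the library's exact code):

```python
import torch
import ignite.distributed as idist

world_size = idist.get_world_size()  # > 1 only when a distributed group exists

if world_size > 1:
    choice = "DistributedDataParallel"  # distributed group present -> DDP
elif torch.cuda.device_count() > 1:
    choice = "DataParallel"             # single process, several GPUs -> DP
else:
    choice = "no wrapping"              # single process, at most one GPU
print(choice)
```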
If there is no distributed processing group, but we have more than one GPU available, we can use DP.
To enable a distributed processing group, the user can specify the backend in idist.Parallel, and a group will be automatically created using all available processes. In our case we leave the decision to the user: by launching a single process (python main.py) on a machine with N GPUs, they can either stay with a single process and use DP, or spawn N sub-processes (Ignite then internally creates a distributed process group, so auto_model will use DDP); see the launch sketch below.

Can we maybe check the world_size?
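For context, a minimal launch sketch (not taken from the issue) showing how idist.Parallel creates a process group so that world_size > 1 and auto_model picks DDP; the backend and process count are illustrative:

```python
import torch.nn as nn
import ignite.distributed as idist

def training(local_rank):
    # Inside each spawned process a distributed group exists, so
    # idist.get_world_size() > 1 and auto_model wraps with DistributedDataParallel.
    model = idist.auto_model(nn.Linear(10, 2))
    print(local_rank, type(model))

# Illustrative values: NCCL backend, 2 processes on one node.
# Without this context manager (plain `python main.py`) there is a single
# process, and auto_model falls back to DataParallel when several GPUs are visible.
with idist.Parallel(backend="nccl", nproc_per_node=2) as parallel:
    parallel.run(training)
```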
Is it justified to keep this use case now that DDP is available and faster than DP? I don't fully understand the different use cases, so I might be wrong; in that case I understand that we should not change this behaviour.