Why use torch.multiprocessing.spawn for distributed training
Hi there,

In the Swin UNETR scripts, e.g. https://github.com/Project-MONAI/research-contributions/blob/main/SwinUNETR/BRATS21/main.py, `torch.multiprocessing.spawn` is used to launch distributed training. Is there a reason you didn't use `torch.distributed.launch`? Did `torch.multiprocessing.spawn` give better performance than `torch.distributed.launch` for BraTS/BTCV-based Swin UNETR training?
Thanks!
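For context, here is a minimal sketch of the `mp.spawn` pattern the question refers to. It is not the actual MONAI SwinUNETR script: the model, data, loop, and port are placeholders and assumptions; only the launching mechanism is the point.

```python
# Minimal sketch of DDP training launched via torch.multiprocessing.spawn.
# The model, data and hyperparameters below are placeholders, not MONAI's.
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP


def train_worker(rank, world_size):
    # Each spawned process sets up its own process group.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")  # assumed free port
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

    model = DDP(torch.nn.Linear(16, 2).cuda(rank), device_ids=[rank])
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

    for _ in range(10):  # placeholder training loop
        x = torch.randn(4, 16, device=rank)
        loss = model(x).sum()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    world_size = torch.cuda.device_count()
    # spawn() creates one process per GPU and passes the rank as the first argument.
    mp.spawn(train_worker, args=(world_size,), nprocs=world_size, join=True)
```

With this pattern the script creates its own worker processes, so it is started with a plain `python main.py` and no external launcher.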
Issue Analytics
- Created a year ago
- Comments: 9 (5 by maintainers)
Top Results From Across the Web

- Torch.distributed.launch vs torch.multiprocessing.spawn: If you need multi-server distributed data parallel training, it might be more convenient to use torch.distributed.launch as it automatically ... (a launcher-style sketch follows this list)
- Why using mp.spawn is slower than using torch.distributed ...: mp.spawn is usually slower due to initialization overhead. In general distributed training is long running, so usually the initialization time ...
- Distributed Computing with PyTorch - Shiv Gehlot: Hence, "torch.multiprocessing.spawn" can be used to spawn the training function "fn()" on each of the GPUs through "args".
- Writing Distributed Applications with PyTorch: torch.distributed enables researchers and practitioners to easily parallelize their computations across processes and clusters of machines.
- Distributed Training Made Easy with PyTorch-Ignite: Then we will also cover several ways of spawning processes via torch native torch.multiprocessing.spawn and also via multiple distributed ...
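For comparison with the spawn-based sketch above, here is a minimal launcher-style version of the same loop, assuming it is started with torchrun (or the older python -m torch.distributed.launch --use_env). Again, the model and loop are placeholders; only the rank handling differs.

```python
# Minimal sketch of launcher-driven DDP, started with e.g.
#   torchrun --nproc_per_node=2 train.py
# The launcher sets RANK, LOCAL_RANK and WORLD_SIZE for every worker process.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    local_rank = int(os.environ["LOCAL_RANK"])
    dist.init_process_group("nccl")  # rank/world size read from the environment
    torch.cuda.set_device(local_rank)

    model = DDP(torch.nn.Linear(16, 2).cuda(local_rank), device_ids=[local_rank])
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

    for _ in range(10):  # placeholder training loop
        x = torch.randn(4, 16, device=local_rank)
        loss = model(x).sum()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

The training code is essentially the same either way; the difference is only in who creates the worker processes (the script itself via mp.spawn, or an external launcher).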
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
I mean, yes: when training on a single GPU the batch size is 1, and on 2 GPUs the batch size is 2, so the per-step time is expected to be longer, but it should be less than 2x the single-GPU time per step/iteration. You can see that 2-GPU training is faster here, but not exactly 2x faster; it's ~1.7x faster.
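As a back-of-the-envelope illustration of that reasoning (the step times below are hypothetical, not measured SwinUNETR numbers):

```python
# Hypothetical step times, only to illustrate the scaling argument above.
single_gpu_step = 1.00  # seconds per step, batch size 1, 1 GPU
two_gpu_step = 1.18     # seconds per step, batch size 2, 2 GPUs (assumed)

# Per step, 2-GPU training is slower in wall-clock time, but well under 2x.
assert single_gpu_step < two_gpu_step < 2 * single_gpu_step

# Per sample (or per epoch), 2 GPUs process twice the data per step.
speedup = 2 * single_gpu_step / two_gpu_step  # ~1.7x, as described above
efficiency = speedup / 2                      # ~85% scaling efficiency
print(f"speedup ~{speedup:.2f}x, efficiency ~{efficiency:.0%}")
```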
Hi @hw-ju, SwinUNETR has been tested for multi-GPU training with both DDP and mp.spawn. Both work well, and there is no performance preference between the different multi-GPU frameworks. You can safely use DDP. Thank you!