DataLoader with num_workers > 1 and Rand[Zoom/Rotate/Flip]d transforms
Describe the bug
When using a DataLoader with num_workers > 1 and a Rand[Zoom/Rotate/Flip]d transform, the random transforms in all of the worker processes share the same random state.
To Reproduce
With train_ds having some randomly parameterized transforms.
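For concreteness, train_ds might be built like the following hypothetical MONAI setup (the file list and transform parameters are illustrative; any dict-style dataset with random transforms reproduces the issue):

```python
from monai.data import Dataset
from monai.transforms import Compose, LoadImaged, RandFlipd, RandRotated, RandZoomd

# Hypothetical input list; in practice these are the raw data filenames.
train_files = [{"img": f"img_{i}.nii.gz"} for i in range(100)]

train_transforms = Compose([
    LoadImaged(keys="img"),
    RandRotated(keys="img", prob=1.0, range_x=0.4),
    RandFlipd(keys="img", prob=0.5, spatial_axis=0),
    RandZoomd(keys="img", prob=1.0, min_zoom=0.8, max_zoom=1.2),
])

train_ds = Dataset(data=train_files, transform=train_transforms)
```

The loader is then constructed as reported: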
```python
from monai.data import list_data_collate
from torch.utils.data import DataLoader

train_loader: DataLoader = DataLoader(
    train_ds,  # a dataset pairing the raw input filenames with the transform definitions
    batch_size=1,
    shuffle=True,
    num_workers=88,
    collate_fn=list_data_collate,
)
```
This is particularly disturbing when running on a machine with 40+ CPUs, where huge numbers of images end up with identical augmentation parameters.
Expected behavior
Each transform should draw its own random parameters, regardless of the number of workers chosen.
Screenshots
NOTE: The number of replicated rotation values is always equal to the num_workers specified.
```
Rotating by 19.367042973517755
Rotating by 19.367042973517755
Rotating by 19.367042973517755
Rotating by 19.367042973517755
Rotating by 4.039486469720721
Rotating by 4.039486469720721
Rotating by 4.039486469720721
Rotating by 4.039486469720721
Rotating by 13.13047017599905
Rotating by 13.13047017599905
Rotating by 13.13047017599905
Rotating by 13.13047017599905
```
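This pattern can be reproduced without MONAI at all; a minimal self-contained sketch of the root cause (assuming the default fork start method on Linux, where each worker inherits a copy of the parent's numpy RNG state):

```python
import numpy as np
from torch.utils.data import DataLoader, Dataset

class Probe(Dataset):
    """Returns one numpy random draw per item."""
    def __len__(self):
        return 8

    def __getitem__(self, idx):
        # Each forked worker starts from the same copied numpy state,
        # so each worker's first draw is identical, then each second
        # draw, and so on.
        return float(np.random.uniform(0, 20))

if __name__ == "__main__":
    # With num_workers=4, the printed values repeat in groups of four.
    for batch in DataLoader(Probe(), num_workers=4):
        print(batch.item())
```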
Issue Analytics
- Created: 3 years ago
- Comments: 9 (9 by maintainers)
Top GitHub Comments
Hi @hjmjohnson ,
Thanks for your bug report. This is a known issue of “numpy + PyTorch multi-processing”, and you can easily fix it by adding the logic below to your DataLoader initialization:
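A minimal sketch of such a hook, following the standard recipe from the PyTorch documentation of reseeding numpy from each worker's distinct torch seed (the function name worker_init_fn is illustrative):

```python
import numpy as np
import torch

def worker_init_fn(worker_id):
    # torch assigns every DataLoader worker a distinct base seed; derive the
    # numpy seed from it so numpy-backed random transforms diverge per worker.
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
```

Passing worker_init_fn=worker_init_fn to the DataLoader shown above then gives each worker its own stream of transform parameters.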
Thanks.
@Nic-Ma @tvercaut I can certainly help with the wiki stuff