Warn if `num_workers == 1`
See original GitHub issue
Currently `num_workers` can wind up being 1 if a machine has only a single CPU (like a minimal cloud instance). In these cases, it might be worth warning users that they only have 1 worker and so they won’t get the benefits of parallelism with Dask. Of course the messaging/communication may need a bit of thought to find the right balance.
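To make the proposal concrete, here is a minimal sketch of what such a check could look like. It is not Dask's implementation: the helper name `resolve_num_workers`, its `cpu_count` parameter, and the warning text are all hypothetical, and it relies only on the standard library's `os.cpu_count()` and `warnings.warn`. It warns only when the worker count was defaulted, since an explicit `num_workers=1` is presumably intentional.

```python
import os
import warnings


def resolve_num_workers(num_workers=None, cpu_count=None):
    """Hypothetical helper (not part of Dask): choose a worker count.

    Warns only when the count is derived from the CPU count and ends up as 1.
    """
    if cpu_count is None:
        # os.cpu_count() can return None when the count is undetermined.
        cpu_count = os.cpu_count() or 1

    if num_workers is not None:
        # The user asked for this value explicitly, so stay quiet.
        return num_workers

    if cpu_count == 1:
        # Illustrative wording only; the issue notes the message needs thought.
        warnings.warn(
            "Only 1 CPU detected, so num_workers defaults to 1 and tasks "
            "will not run in parallel.",
            UserWarning,
            stacklevel=2,
        )
    return cpu_count
```

Passing `cpu_count` explicitly is only there so the behaviour can be exercised on any machine, for example `resolve_num_workers(cpu_count=1)`.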
Issue Analytics
- State:
- Created 2 years ago
- Comments: 15 (13 by maintainers)
I initially was pro adding a warning here (this came up after debugging why a user wasn’t seeing parallelism in their code). After reading through the comments above though, I think I’m now against adding a warning at all. I could see the benefit of a debug log message though when the scheduler starts up. Something like:
at the start of `dask.threaded.get`/`dask.multiprocessing.get`/`dask.local.get_sync` (adjusting the message for each scheduler accordingly).
I think what you’re proposing sounds reasonable. Yes, let’s see if other people also want to weigh in.
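For readers following along, the debug-log idea could be sketched roughly as below. The exact message proposed in the thread is not preserved here, so the wording, the logger name `scheduler_sketch`, and the helper `log_worker_count` are all placeholders made up for illustration. Only the standard `logging` module is used, and nothing here is taken from Dask's code.

```python
import logging

# Hypothetical logger name, chosen only for this sketch.
logger = logging.getLogger("scheduler_sketch")


def log_worker_count(scheduler_name, num_workers):
    """Emit a debug-level record with the worker count (illustrative only)."""
    # Placeholder wording; a real message would be adjusted per scheduler.
    logger.debug(
        "%s scheduler starting with num_workers=%d", scheduler_name, num_workers
    )
```

Because the record is emitted at `DEBUG` level, it stays invisible unless the user opts in (for example with `logging.basicConfig(level=logging.DEBUG)`), which is the main difference from a warning.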