Shuffling automatically set to True in Config API - not compatible with iterable datasets
🐛 Bug Report
The config API appears to assume that the train dataloader should always be shuffled, as per the following line: https://github.com/catalyst-team/catalyst/blob/a7bc302a762d7d9f462ded6d9cd6ae70f8b656aa/catalyst/utils/data.py#L201
This behaviour is particularly undesirable when using iterable datasets, as they are incompatible with shuffle=True. It would probably be better to let the user specify the desired value of shuffle in the config.yml file.
At the moment, if the user passes:
loaders_params: {"shuffle": False}
it is overwritten by the line linked above, which leads to the PyTorch error:
ValueError: DataLoader with IterableDataset: expected unspecified shuffle option, but got shuffle=True
How To Reproduce
Steps to reproduce the behavior:
- Create an iterable dataset in customRunner.get_datasets()
- Use the config API to create a loader for the dataset, e.g. with the following params:
loaders: &loaders
  batch_size: None
  num_workers: 0
  drop_last: False
  per_gpu_scaling: False
  loaders_params: {"shuffle": False}
- See the following PyTorch error:
ValueError: DataLoader with IterableDataset: expected unspecified shuffle option, but got shuffle=True
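The error itself can be reproduced without Catalyst. The following is a minimal sketch in plain PyTorch; the `CountingDataset` class is a hypothetical stand-in for whatever iterable dataset the runner returns, and is not taken from the original report.

```python
from torch.utils.data import DataLoader, IterableDataset


class CountingDataset(IterableDataset):
    """Hypothetical iterable dataset that yields the integers 0..n-1."""

    def __init__(self, n: int):
        self.n = n

    def __iter__(self):
        return iter(range(self.n))


dataset = CountingDataset(4)

# With shuffle=False the loader is constructed and iterates normally.
loader = DataLoader(dataset, batch_size=2, shuffle=False)
batches = [batch.tolist() for batch in loader]

# With shuffle=True the DataLoader constructor raises immediately.
try:
    DataLoader(dataset, batch_size=2, shuffle=True)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(batches)
print(error_message)
```

Since the exception is raised when the loader is constructed, any code path that forces shuffle=True for an iterable dataset fails before training starts.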
Expected behavior
When passing
loaders_params: {"shuffle": False}
one would expect shuffling to be turned off for both loaders.
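One way the expected behavior could be achieved is to treat the shuffle value as a default rather than an override. The sketch below is only an illustration: `resolve_loader_params` is a hypothetical helper, not part of Catalyst, and the rule "shuffle loaders whose name starts with train" is an assumption made for the example.

```python
from typing import Any, Dict


def resolve_loader_params(
    loader_name: str, user_params: Dict[str, Any], is_iterable: bool
) -> Dict[str, Any]:
    """Hypothetical helper: fill in `shuffle` only if the user did not set it."""
    params = dict(user_params)  # copy, so the caller's dict is not mutated
    # Assumed default: shuffle train loaders, but never iterable datasets,
    # because DataLoader rejects shuffle=True for them.
    default_shuffle = (not is_iterable) and loader_name.startswith("train")
    # setdefault keeps an explicit user value such as {"shuffle": False}.
    params.setdefault("shuffle", default_shuffle)
    return params


print(resolve_loader_params("train", {"shuffle": False}, is_iterable=False))
print(resolve_loader_params("train", {}, is_iterable=True))
```

With this approach an explicit `shuffle` entry in the config always wins, and iterable datasets get shuffle=False unless the user says otherwise.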
Environment
Catalyst version: 21.03.2
PyTorch version: 1.7.1
TensorFlow version: N/A
TensorBoard version: 2.4.1
OS: Ubuntu 16.04.6 LTS
Python version: 3.7
Nvidia driver version: 460.32.03
Checklist
- [x] bug description
- [x] steps to reproduce
- [x] expected behavior
- [x] environment
- code sample / screenshots
Issue Analytics
- Created 2 years ago
- Comments: 5 (3 by maintainers)
Top GitHub Comments
Oh, I see, as a possible workaround:
Nevertheless, it would be great if you could inject a hotfix here.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.