Shuffle = False throws error on multi-gpu
See original GitHub issue

When we set shuffle = False at training time, the GroupedBatchSampler throws an error:

IndexError: index 0 is out of bounds for dimension 0 with size 0
Issue Analytics
- Created 5 years ago
- Comments: 10 (4 by maintainers)
Top Results From Across the Web

Keras/Tensorflow multi GPU InvalidArgumentError in optimizer
Your issue seems to be similar to the one reported here. It appears that the input data size must be a multiple of...

Problems with multi-gpus - MATLAB Answers - MathWorks
I can reproduce your issue. It seems the issue is your use of an anonymous function to call a nested function for your...

Multi-GPU Computing with Pytorch (Draft)
Pytorch provides a few options for multi-GPU/multi-CPU computing or ... keys will be improperly named and your loader will throw an error.

Accelerate + Multi-GPU + Automatic1111 + Dreambooth ...
I noticed during launch that I was getting an error saying that triton ... Shuffle After Epoch = False ... Gradient Checkpointing =...

Reproducibility — PyTorch 1.13 documentation
First, you can control sources of randomness that can cause multiple ... and to throw an error if an operation is known to...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
I hit this problem even with shuffle=True; it happened when I switched to a different dataset, so I think it's a corner case. Please help solve it. Here is some information (from debug prints I added): the dataset has 4492 images and the batch size is 8 (4 GPUs, 2 images each). The first 3 GPUs print a "merged" list of length 563 whose last elements are tensor([1110, 826]), tensor([3728]), and tensor([], dtype=torch.int64) — an empty tensor. The last GPU prints a "merged" list of length 562, with no empty tensor.
In my case, group_ids contains only a single 1 and the rest are 0, so groups is [0, 1]. In
clusters = [(self.group_ids == i) & mask for i in self.groups]
with some random orderings, clusters ends up as [[1, 1, 1, ...], []] — the second cluster is empty — and the error happens.
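The failure mode described in that comment can be sketched in plain Python (this stands in for the tensor logic; the group_ids values, the shard mask, and the indices below are made up for illustration — they are not taken from the actual sampler code):

```python
# Hypothetical minimal reproduction of an empty cluster in a grouped sampler.
group_ids = [0, 0, 0, 1, 0, 0, 0, 0]   # only one sample belongs to group 1
groups = sorted(set(group_ids))         # [0, 1]

# Simulate a per-GPU shard mask that happens to drop the lone group-1 sample.
mask = [True] * len(group_ids)
mask[3] = False

# Plain-Python equivalent of:
#   clusters = [(self.group_ids == i) & mask for i in self.groups]
clusters = [[i for i, g in enumerate(group_ids) if g == grp and mask[i]]
            for grp in groups]
print(clusters)  # [[0, 1, 2, 4, 5, 6, 7], []] — the group-1 cluster is empty

# Taking the first element of the empty cluster reproduces the reported
# "index 0 is out of bounds for dimension 0 with size 0" style of failure:
try:
    first = clusters[1][0]
except IndexError:
    print("IndexError on empty cluster")
```

This suggests why the error depends on ordering: with shuffle=False (or an unlucky shuffle), a rank's shard can contain no samples from some group, so that group's cluster is empty and any code that unconditionally indexes element 0 of each cluster fails.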