Shuffle = False throws error on multi-gpu


When we set shuffle = False during training, the grouped_batch_sampler throws this error:

IndexError: index 0 is out of bounds for dimension 0 with size 0

Issue Analytics

  • State: open
  • Created: 5 years ago
  • Comments: 10 (4 by maintainers)

Top GitHub Comments

kakaluote commented, Aug 12, 2019 (1 reaction)

I hit this problem too, but with shuffle=True. It started when I changed the dataset. Please help me solve it; I think it's a corner case. Here is some information (I added some prints):

  • dataset: 4492 images
  • batch size: 8 (4 GPUs, each with 2 images)
  • first 3 GPUs: the printed “merged” length is 563, and the last elements are tensor([1110, 826]), tensor([3728]), tensor([], dtype=torch.int64)), that is, an empty tensor
  • last GPU: the printed “merged” length is 562, with no empty tensor

kakaluote commented, Aug 12, 2019 (0 reactions)

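In my case, group_ids contains a single 1 and every other entry is 0, so groups is [0, 1]. The clusters are built with clusters = [(self.group_ids == i) & mask for i in self.groups]. For some random samplings, the single group-1 image is not among the sampled indices, so clusters becomes [[1, 1, 1, …], []], and the error occurs.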
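
The numbers in the two comments above can be checked with a small pure-Python sketch. It is not the sampler's actual code: `split_chunks`, `batches_for_rank`, the round-robin sharding, and the position of the single group-1 image are all hypothetical stand-ins. It also assumes that splitting an empty cluster yields one empty chunk, which is consistent with the empty tensor in the printed output. The `drop_empty` flag shows one possible guard, not a confirmed fix.

```python
def split_chunks(indices, batch_size):
    """Split a list of indices into chunks of at most batch_size.

    Assumption: an empty input yields ONE empty chunk (not zero chunks),
    matching the empty tensor seen in the reported "merged" output.
    """
    if not indices:
        return [[]]
    return [indices[i:i + batch_size] for i in range(0, len(indices), batch_size)]


def batches_for_rank(sampled, group_ids, groups, batch_size, drop_empty=False):
    """Cluster one rank's sampled indices by group, then split each cluster."""
    clusters = [[i for i in sampled if group_ids[i] == g] for g in groups]
    merged = [chunk for c in clusters for chunk in split_chunks(c, batch_size)]
    if drop_empty:
        # Hypothetical guard: discard chunks that contain no indices.
        merged = [b for b in merged if len(b) > 0]
    return merged


# Numbers from the report: 4492 images, 4 GPUs, 2 images per GPU.
num_images, world_size, per_gpu = 4492, 4, 2

# Every image is in group 0 except one (its position is made up).
group_ids = [0] * num_images
group_ids[3000] = 1
groups = [0, 1]

# Hypothetical round-robin sharding: 1123 indices per rank.
shards = [list(range(r, num_images, world_size)) for r in range(world_size)]

merged_per_rank = [batches_for_rank(s, group_ids, groups, per_gpu) for s in shards]
lengths = [len(m) for m in merged_per_rank]
has_empty = [any(len(b) == 0 for b in m) for m in merged_per_rank]

guarded_lengths = [
    len(batches_for_rank(s, group_ids, groups, per_gpu, drop_empty=True))
    for s in shards
]

print(lengths)          # per-rank "merged" length without the guard
print(has_empty)        # whether each rank ends up with an empty batch
print(guarded_lengths)  # per-rank length with empty chunks dropped
```

Under these assumptions, the rank that holds the group-1 image produces 561 + 1 = 562 batches, while each of the other three ranks produces 562 batches from group 0 plus one empty chunk from group 1, for 563 in total. Dropping empty chunks leaves 562 batches on every rank.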
