Shuffle = False throws error on multi-gpu

When we set shuffle = False during training, the grouped_batch_sampler throws the following error:

IndexError: index 0 is out of bounds for dimension 0 with size 0

Issue Analytics

Created: 5 years ago
Comments: 10 (4 by maintainers)

Top GitHub Comments

kakaluote commented, Aug 12, 2019

I hit this problem too, but with shuffle=True. It started after I switched to a different dataset. Please help me solve it; I think it's a corner case. Here is some information from prints I added: dataset: 4492 images; batch size: 8 (4 GPUs, 2 images each). On the first 3 GPUs, the length of “merged” is 563, and its last elements are tensor([1110, 826]), tensor([3728]), tensor([], dtype=torch.int64), the last of which is an empty tensor. On the last GPU, the length of “merged” is 562, with no empty tensor.

kakaluote commented, Aug 12, 2019

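In my case, group_ids contains a single 1 and every other entry is 0, so groups is [0, 1], and the clusters are built with clusters = [(self.group_ids == i) & mask for i in self.groups]. For some random samplings, the indices drawn by a process do not include the only group-1 sample, so clusters becomes [[1, 1, 1, …], []], and the error occurs.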
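
The failure described above can be sketched without any framework. This is only a minimal illustration, assuming plain Python lists in place of tensors. The names `chunk`, `build_batches`, and `drop_empty` are hypothetical and are not the sampler's real code. `chunk` is deliberately written to return one empty chunk for an empty input, to mirror the empty tensor seen in the printed “merged” output. Dropping empty batches is shown as one possible workaround, not as the fix adopted in the original thread.

```python
from itertools import chain


def chunk(ids, batch_size):
    """Split ids into chunks of batch_size.

    Assumption: an empty input yields a single empty chunk, mirroring the
    empty tensor reported at the end of "merged".
    """
    if not ids:
        return [[]]
    return [ids[i:i + batch_size] for i in range(0, len(ids), batch_size)]


def build_batches(group_ids, sampled_ids, batch_size, drop_empty=False):
    """Group the sampled indices by group id, then split each group into batches."""
    groups = sorted(set(group_ids))
    # One cluster per group: the sampled indices that belong to that group.
    # A group with no sampled index produces an empty cluster.
    clusters = [[i for i in sampled_ids if group_ids[i] == g] for g in groups]
    merged = list(chain.from_iterable(chunk(c, batch_size) for c in clusters))
    if drop_empty:
        # Hypothetical workaround: discard batches that contain no index.
        merged = [batch for batch in merged if len(batch) > 0]
    # Reading the first element of every batch fails on an empty batch.
    first_elements = [batch[0] for batch in merged]
    return merged, first_elements


if __name__ == "__main__":
    # Hypothetical data: index 3 is the only sample in group 1.
    group_ids = [0, 0, 0, 1, 0, 0]
    shard = [0, 1, 2]  # indices drawn by a process that did not get index 3
    try:
        build_batches(group_ids, shard, batch_size=2)
    except IndexError as exc:
        print("reproduced:", exc)
    print(build_batches(group_ids, shard, batch_size=2, drop_empty=True))
```

The reported lengths fit this picture if the 4492 indices are sharded evenly, which is an assumption here. Each of the 4 processes then gets 1123 indices. A process holding only group-0 indices produces 562 non-empty batches plus one empty one, for a length of 563. The process holding the single group-1 index produces 561 + 1 = 562 batches and no empty one.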