
DistributedProxySampler RuntimeError when indices are padded

🐛 Bug description

The RuntimeError raised by DistributedProxySampler at line 241 shouldn't occur, since the indices are padded with the full sample (behaviour that was updated because of this comment).
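
For reference, here is a minimal sketch of the arithmetic involved (illustrative only, not the actual ignite source): when the wrapped sampler is padded by repeating the full sample, the padded length can overshoot total_size, which is exactly what the length check then rejects.

import math

# Illustrative sketch only, not the ignite implementation.
num_samples = 100                 # size of the wrapped WeightedRandomSampler
num_replicas = 8
total_size = math.ceil(num_samples / num_replicas) * num_replicas  # 13 * 8 = 104

indices = list(range(num_samples))           # stands in for list(self.sampler)
while len(indices) < total_size:
    indices += list(range(num_samples))      # pad with the full sample again

print(len(indices), total_size)              # 200 vs 104 -> trips the length check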

Environment

  • PyTorch Version (e.g., 1.4):
  • Ignite Version (e.g., 0.3.0):
  • OS (e.g., Linux):
  • How you installed Ignite (conda, pip, source):
  • Python version:
  • Any other relevant information:

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
ryanwongsa commented, Jul 9, 2020

@vfdev-5 I created PR #1192 with the changes you described above. The test has also been updated to reflect the newer test you described earlier. I am not sure if the PR is how you wanted it, so it would be good to get feedback. Thanks

1 reaction
ryanwongsa commented, Jul 9, 2020

Taking the example from the unit test and setting num_replicas to 8 produces the error:

from ignite.distributed.auto import DistributedProxySampler
import torch
from torch.utils.data import WeightedRandomSampler

# Weighted sampler over 100 items; the first 50 get double weight
weights = torch.ones(100)
weights[:50] += 1
num_samples = 100
sampler = WeightedRandomSampler(weights, num_samples)

# One proxy sampler per rank
num_replicas = 8
dist_samplers = [DistributedProxySampler(sampler, num_replicas=num_replicas, rank=i) for i in range(num_replicas)]

torch.manual_seed(0)
true_indices = list(sampler)

# Collect the indices produced by every rank for epoch 0
indices_per_rank = []
for s in dist_samplers:
    s.set_epoch(0)
    indices_per_rank += list(s)

# Every index drawn by the base sampler should be covered across ranks
assert set(indices_per_rank) == set(true_indices)

The error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-3-d02cd2dd1018> in <module>
     17 for s in dist_samplers:
     18     s.set_epoch(0)
---> 19     indices_per_rank += list(s)

/opt/conda/lib/python3.7/site-packages/ignite/distributed/auto.py in __iter__(self)
    240 
    241         if len(indices) != self.total_size:
--> 242             raise RuntimeError("{} vs {}".format(len(indices), self.total_size))
    243 
    244         # subsample

RuntimeError: 200 vs 104

The assert will also fail after fixing the RuntimeError, but that is because of the padding.
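
As a rough back-of-envelope check (my reading of the padding described above, not the actual ignite code): total_size = ceil(100 / 8) * 8 = 104, while the wrapped sampler only yields 100 indices, so 4 of the subsampled positions come from the padded portion; if those slots are filled from an extra draw of the weighted sampler, they need not appear in true_indices, which would break the set comparison.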
