Wrong number of 'samples_per_volume' sampled within one epoch
Is there an existing issue for this?
- I have searched the existing issues
Bug summary
Hi Fernando,
First of all, thank you very much for adding this functionality (https://github.com/fepegar/torchio/pull/795), and sorry that it took me so long to test it.
I think I found that, during training, the Queue samples the wrong number of patches per volume. See the example below.
Code for reproduction
import torchio as tio
import numpy as np
from torch.utils.data.dataloader import DataLoader
import torch, random

# Make the run reproducible
seed = 42
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
np.random.seed(seed)
random.seed(seed)

# Ten subjects, each requesting 8 patches per epoch via its num_samples attribute
subjects = []
for sub_id in range(1, 11):
    params = {
        "im": tio.ScalarImage(tensor=np.random.random((1, 320, 320, 10))),
        "num_samples": 8,
        "info": str(sub_id),
    }
    subjects.append(tio.Subject(**params))

patch_size = (320, 320, 1)
batch_size = 10

sd = tio.SubjectsDataset(subjects)
sampler = tio.data.UniformSampler(patch_size=patch_size)
queue = tio.Queue(sd, max_length=50, shuffle_patches=True,
                  samples_per_volume=-1,  # each subject's num_samples should take precedence
                  sampler=sampler, num_workers=8, shuffle_subjects=True)
tr_loader = DataLoader(queue, batch_size=batch_size, shuffle=False,
                       pin_memory=False, num_workers=0)

# One epoch of training
for patch in tr_loader:
    print(patch["info"])
Actual outcome
['6', '2', '7', '9', '7', '2', '4', '4', '5', '4']
['2', '7', '2', '1', '7', '5', '7', '9', '7', '1']
['5', '9', '5', '4', '4', '5', '1', '9', '7', '4']
['5', '9', '2', '1', '9', '7', '2', '1', '9', '4']
['5', '2', '1', '2', '6', '1', '4', '9', '1', '5']
['10', '10', '5', '2', '10', '10', '8', '3', '3', '3']
['8', '10', '3', '5', '2', '8', '8', '8', '8', '8']
['5', '8', '2', '3', '2', '5', '3', '3', '3', '5']
Error messages
There are no error messages. The output above is the list of subject IDs sampled in each of the 8 training iterations, and it disagrees with my understanding of what should happen. Maybe my understanding is wrong; see below.
Expected outcome
The code above reproduces one epoch of training. The queue size is 50, the training set contains 10 subjects, and I want to sample 8 patches per subject, so one epoch = 80 patches. Thus, the queue is loaded twice: first with 50 patches (the first 5 lines of the output) and then with 30 patches (the last 3 lines).
I expected each subject to be sampled exactly 8 times. However, according to the output above, the subjects are sampled unevenly: for example, subject '6' is sampled only twice, while subject '10' is sampled 5 times (all in the second load). Some subjects are sampled the right number of times in the first load of the queue (e.g., '2' and '5'), but are then resampled in the second load. The discrepancy can be made explicit with a tally, as in the sketch below.
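A minimal sketch of such a tally, assuming the tr_loader from the reproduction code above (iterating the loader again re-fills the queue, which is fine for this check):

from collections import Counter

# Count how many patches each subject contributes in one epoch
counts = Counter()
for patch in tr_loader:
    counts.update(patch["info"])  # patch["info"] is a list of subject IDs

# If samples_per_volume were honoured, every subject would appear exactly 8 times
for sub_id, n in sorted(counts.items(), key=lambda kv: int(kv[0])):
    print(f"subject {sub_id}: {n} patches")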
Importantly, if we change the queue size to 80 (so that all subjects are sampled in a single 'load' of the queue, as sketched below), the problem disappears. I haven't investigated this, but the problem might be in the way the number of samples per volume is tracked in the queue.
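A sketch of that workaround, reusing sd and sampler from the reproduction code: set max_length to the total number of patches per epoch (10 subjects x 8 samples = 80) so the queue is filled in one load.

# Workaround: make the queue hold one full epoch of patches (10 subjects x 8 samples),
# so all 80 patches are drawn in a single load and no subject is resampled
queue = tio.Queue(sd, max_length=80, shuffle_patches=True,
                  samples_per_volume=-1,
                  sampler=sampler, num_workers=8, shuffle_subjects=True)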
System info
Platform: Linux-4.15.0-142-generic-x86_64-with-glibc2.17
TorchIO: 0.18.84
PyTorch: 1.8.1+cu101
SimpleITK: 2.2.0 (ITK 5.3)
NumPy: 1.23.3
Python: 3.8.9 (default, Apr 3 2021, 01:02:10)
[GCC 5.4.0 20160609]
Top GitHub Comments
Fixed in #981. Thanks @jmlipman for your detailed report and code to reproduce.
I think I got it. It only took me three hours.