Concatenate torchvision.datasets.FakeData with another dataset -> cannot load it

🐛 Bug

If you concatenate a dataset such as CIFAR10 with FakeData and iterate over the result with a DataLoader, you get the following error:

AttributeError: 'int' object has no attribute 'numel'

To Reproduce

Steps to reproduce the behavior:

cifar_dataset = torchvision.datasets.CIFAR10(...)
fake_dataset = torchvision.datasets.FakeData(...)
train_data = torch.utils.data.ConcatDataset([cifar_dataset, fake_dataset])
train_loader = DataLoader(train_data, ...)
for data in train_loader:  # raises the AttributeError shown earlier
    pass

Additional context

This happens because the labels in CIFAR10 are Python ints, while the labels in FakeData are 0-dim tensors. When samples from both datasets are combined into a batch, the batch labels look like [0, 1, 2, 3, tensor(0), 3, 4, 5, 6, tensor(2), …], and the DataLoader cannot collate the mixed types.

I can work around this bug by passing target_transform=int when I load fake_dataset. However, the problem is very hard to debug. I think the default target type in the FakeData source code should be int instead of a long tensor.
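For datasets where a target_transform argument is not available, the same idea can be sketched as a small wrapper that casts every label to int before the datasets are concatenated. This is only an illustration, not code from torchvision: IntLabelDataset and FakeScalar are hypothetical names, and FakeScalar merely stands in for a 0-dim tensor label so that the sketch runs without torch installed.

```python
class IntLabelDataset:
    """Hypothetical wrapper that returns (sample, int(label)) for a map-style dataset."""

    def __init__(self, dataset):
        self.dataset = dataset

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, index):
        sample, target = self.dataset[index]
        # int() accepts plain Python ints and any object that implements
        # __int__, which includes 0-dim integer tensors.
        return sample, int(target)


class FakeScalar:
    """Stand-in for a 0-dim tensor label (assumption, so the sketch needs no torch)."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        return self.value


# One dataset with int labels and one with tensor-like labels.
int_labeled = [("img0", 0), ("img1", 1)]
tensor_labeled = [("img2", FakeScalar(2)), ("img3", FakeScalar(0))]

wrapped = [IntLabelDataset(int_labeled), IntLabelDataset(tensor_labeled)]

# After wrapping, every label has the same type regardless of its source.
labels = [ds[i][1] for ds in wrapped for i in range(len(ds))]
print(labels)
```

With real datasets, the wrapped objects would presumably be passed to torch.utils.data.ConcatDataset in place of the raw datasets, so that every label reaching the DataLoader is a plain int.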

Here: https://pytorch.org/vision/0.8/_modules/torchvision/datasets/fakedata.html#FakeData

In the function __getitem__:

target = torch.randint(0, self.num_classes, size=(1,), dtype=torch.long)[0]

This is a long tensor. It should be an int.

cc @pmeier @fmassa @vfdev-5

Issue Analytics

State:
Created 3 years ago
Comments: 5 (3 by maintainers)

Top GitHub Comments

pmeier commented, Mar 22, 2021

Indeed it should. If the PR contains a certain keyword together with the issue number, GitHub will close the issue automatically when the PR is merged.

You used a keyword in your PR that is not recognized by GitHub:

solves #3517

avijit9 commented, Mar 22, 2021

@pmeier Shouldn’t this issue be closed?
