Concatenate torchvision.datasets.FakeData with another dataset -> cannot load it


🐛 Bug

If you concatenate a dataset such as CIFAR10 with FakeData, iterating over the result raises this error:

  • AttributeError: 'int' object has no attribute 'numel'

To Reproduce

Steps to reproduce the behavior:

  1. cifar_dataset = torchvision.datasets.CIFAR10(...)
  2. fake_dataset = torchvision.datasets.FakeData(...)
  3. train_data = Concat([cifar_dataset, fake_dataset])
  4. train_loader = DataLoader(train_data, ...)
  5. Iterate with for data in train_loader; the error is raised.

Additional context

This happens because CIFAR10 labels are Python ints, while FakeData labels are tensors. When samples from both datasets are collated into a batch, the batch labels look like [0, 1, 2, 3, tensor(0), 3, 4, 5, 6, tensor(2), …].
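The mixed label types can be seen without downloading anything. The sketch below is not the reporter's code: `IntLabelDataset` and `TensorLabelDataset` are hypothetical stand-ins for CIFAR10 and for FakeData as described in this issue, and it assumes that `Concat` in the steps above refers to `torch.utils.data.ConcatDataset`.

```python
import torch
from torch.utils.data import ConcatDataset, Dataset


class IntLabelDataset(Dataset):
    """Hypothetical stand-in for CIFAR10: labels are plain Python ints."""

    def __len__(self):
        return 4

    def __getitem__(self, index):
        return torch.zeros(3, 8, 8), index % 10


class TensorLabelDataset(Dataset):
    """Hypothetical stand-in for FakeData as described in the issue:
    labels are 0-dim long tensors."""

    def __len__(self):
        return 4

    def __getitem__(self, index):
        label = torch.randint(0, 10, size=(1,), dtype=torch.long)[0]
        return torch.zeros(3, 8, 8), label


combined = ConcatDataset([IntLabelDataset(), TensorLabelDataset()])

# Collect the type name of every label in the concatenated dataset.
label_types = [type(combined[i][1]).__name__ for i in range(len(combined))]
print(label_types)
```

A shuffled DataLoader draws from both halves of `combined`, so a single batch can contain both kinds of label.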

I can work around this by passing target_transform=int when I create fake_dataset, but the cause is very hard to track down. I think the default target type in the FakeData source code should be int instead of a long tensor.
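A minimal sketch of that workaround, under stated assumptions: `IntLabels` and `TensorLabels` are hypothetical stand-ins (not torchvision classes), and `TensorLabels` only imitates the way a `target_transform` callable is applied to the target before it is returned. It has not been checked against any particular torchvision release.

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, Dataset


class IntLabels(Dataset):
    """Hypothetical stand-in for a dataset with Python int labels."""

    def __len__(self):
        return 4

    def __getitem__(self, index):
        return torch.zeros(3, 8, 8), index % 10


class TensorLabels(Dataset):
    """Hypothetical stand-in for a dataset whose raw labels are 0-dim long tensors."""

    def __init__(self, target_transform=None):
        self.target_transform = target_transform

    def __len__(self):
        return 4

    def __getitem__(self, index):
        target = torch.randint(0, 10, size=(1,), dtype=torch.long)[0]
        if self.target_transform is not None:
            # With target_transform=int, the 0-dim tensor becomes a Python int.
            target = self.target_transform(target)
        return torch.zeros(3, 8, 8), target


train_data = ConcatDataset([IntLabels(), TensorLabels(target_transform=int)])
train_loader = DataLoader(train_data, batch_size=8, shuffle=True)

# Every label is now an int, so the batch collates into one long tensor.
images, labels = next(iter(train_loader))
print(images.shape, labels.dtype)
```

With every label converted to the same type, the loader no longer has to collate ints and tensors together.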

Here: https://pytorch.org/vision/0.8/_modules/torchvision/datasets/fakedata.html#FakeData

In the function __getitem__:

  target = torch.randint(0, self.num_classes, size=(1,), dtype=torch.long)[0]

It's a long tensor. It should be an int.
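To make the type difference concrete, the snippet below runs the quoted expression on its own and converts the result with `Tensor.item()`. This is only one possible way to obtain an int and is not necessarily the change that was merged; `num_classes = 10` is an assumed value for illustration.

```python
import torch

num_classes = 10  # assumed value, for illustration only

# Same expression as in the quoted line: indexing a 1-element tensor
# yields a 0-dim tensor, not a Python int.
target = torch.randint(0, num_classes, size=(1,), dtype=torch.long)[0]

# Tensor.item() extracts the value as a plain Python number.
as_int = target.item()

print(type(target).__name__, target.dim(), target.dtype)
print(type(as_int).__name__)
```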

cc @pmeier @fmassa @vfdev-5

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

pmeier commented, Mar 22, 2021 (1 reaction)

Indeed it should. If the PR contains a certain keyword together with the issue number, GitHub will close the issue automatically when the PR is merged.

You used a keyword in your PR that is not recognized by GitHub:

solves #3517

avijit9 commented, Mar 22, 2021 (0 reactions)

@pmeier Shouldn’t this issue be closed?


