Failed to download CelebA dataset using download=True


🐛 Bug

The download fails for the following files:

  1. img_align_celeba.zip

Instead of the zip file, it downloads an HTML page titled “Google Drive - Quota exceeded”, which results in a BadZipFile error.

  2. list_attr_celeba.txt

Similarly, the downloaded file is the “Google Drive - Quota exceeded” page. This time it raises RuntimeError('Dataset not found or corrupted.' + ' You can use download=True to download it').

  3. list_landmarks_align_celeba.txt

Same as number 2.
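
One way to confirm that a downloaded file is the quota page rather than the real data is to inspect it directly. The following is a minimal sketch using only the Python standard library; the helper names looks_like_html and diagnose_download are hypothetical and not part of torchvision, and the HTML check is a simple heuristic on the first bytes of the file.

```python
import zipfile
from pathlib import Path


def looks_like_html(path, probe_bytes=512):
    """Heuristic: True if the file starts like an HTML page."""
    with open(path, "rb") as fh:
        head = fh.read(probe_bytes).lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


def diagnose_download(path):
    """Classify a downloaded file as 'missing', 'zip', 'html', or 'other'."""
    path = Path(path)
    if not path.is_file():
        return "missing"
    if zipfile.is_zipfile(path):
        return "zip"
    if looks_like_html(path):
        return "html"
    # Plain-text annotation files such as list_attr_celeba.txt land here.
    return "other"
```

A result of "html" for img_align_celeba.zip or for one of the .txt files suggests that the Google Drive quota page was saved in place of the dataset file.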

To Reproduce

Steps to reproduce the behavior:

  1. train_dataset = datasets.CelebA('data', split="train", transform=transforms.ToTensor(), download=True)

Expected behavior

Environment

PyTorch version: 1.2.0
Is debug build: No
CUDA used to build PyTorch: 10.0

OS: Microsoft Windows 10 Home Single Language
GCC version: (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 8.1.0
CMake version: Could not collect

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: 10.0.130
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Could not collect

Versions of relevant libraries:
[pip3] numpy==1.17.0
[pip3] torch==1.2.0
[pip3] torchtext==0.4.0
[pip3] torchvision==0.4.0
[conda] Could not collect

Additional context

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 6
  • Comments: 16 (2 by maintainers)

Top GitHub Comments

5 reactions
pmeier commented, Feb 27, 2020

The error message

Google Drive - Quota exceeded

means that the traffic for this file (its size and number of downloads) exceeded a limit or quota set by Google Drive. Since we do not host the dataset, this is not an error on our side and we cannot fix it. According to the answer in the above link, this quota is reset every 24 hours, so a possible fix is to try again later and hope that the traffic limit has not been reached yet.

3 reactions
MohanadOdema commented, Jun 1, 2021

Can I just point out a workaround that worked for me, rather than trying my luck every 24 hours?

The files needed for the CelebA dataset, as defined in the file list in torchvision’s CelebA class, are as follows:

img_align_celeba.zip, list_attr_celeba.txt, identity_CelebA.txt, list_bbox_celeba.txt, list_landmarks_align_celeba.txt, list_eval_partition.txt

I downloaded them directly from the authors’ Google Drive link here and placed them in the path: {root}/celeba

where root is the directory you pass when calling the CelebA class.
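
Before constructing the dataset, it can help to check that every file from the list above is actually present under {root}/celeba. The sketch below assumes exactly the file names quoted in this comment; the helper name missing_celeba_files is hypothetical and not part of torchvision.

```python
from pathlib import Path

# File names as listed in the comment above.
CELEBA_FILES = [
    "img_align_celeba.zip",
    "list_attr_celeba.txt",
    "identity_CelebA.txt",
    "list_bbox_celeba.txt",
    "list_landmarks_align_celeba.txt",
    "list_eval_partition.txt",
]


def missing_celeba_files(root):
    """Return the expected CelebA files that are absent from {root}/celeba."""
    base = Path(root) / "celeba"
    return [name for name in CELEBA_FILES if not (base / name).is_file()]
```

For example, missing_celeba_files('data') returns an empty list once all six files are in data/celeba. After that, the datasets.CelebA('data', split="train", ...) call from the reproduction step should be able to pick up the local files. Depending on the torchvision version, the class may also verify checksums and expect img_align_celeba.zip to be extracted in place, so this check only covers file presence.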


