Unable to load CelebA dataset. File is not zip file error.


🐛 Bug

Unable to download the CelebA dataset and load it into a DataLoader.

To Reproduce

  1. Load the CelebA dataset with download=True. The call raises the error shown under "Returns".
import torch
from torchvision import datasets, transforms

batch_size = 25
train_loader = torch.utils.data.DataLoader(
    datasets.CelebA('../data', split="train", download=True,
                    transform=transforms.Compose([
                        transforms.ToTensor(),
                        transforms.Normalize((0.5,), (0.5,))
                    ])),
    batch_size=batch_size, shuffle=True)

Returns

/usr/local/lib/python3.6/dist-packages/torchvision/datasets/celeba.py in __init__(self, root, split, target_type, transform, target_transform, download)
     64 
     65         if download:
---> 66             self.download()
     67 
     68         if not self._check_integrity():

/usr/local/lib/python3.6/dist-packages/torchvision/datasets/celeba.py in download(self)
    118             download_file_from_google_drive(file_id, os.path.join(self.root, self.base_folder), filename, md5)
    119 
--> 120         with zipfile.ZipFile(os.path.join(self.root, self.base_folder, "img_align_celeba.zip"), "r") as f:
    121             f.extractall(os.path.join(self.root, self.base_folder))
    122 

/usr/lib/python3.6/zipfile.py in __init__(self, file, mode, compression, allowZip64)
   1129         try:
   1130             if mode == 'r':
-> 1131                 self._RealGetContents()
   1132             elif mode in ('w', 'x'):
   1133                 # set the modified flag so central directory gets written

/usr/lib/python3.6/zipfile.py in _RealGetContents(self)
   1196             raise BadZipFile("File is not a zip file")
   1197         if not endrec:
-> 1198             raise BadZipFile("File is not a zip file")
   1199         if self.debug > 1:
   1200             print(endrec)

BadZipFile: File is not a zip file
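
The traceback shows that zipfile rejected img_align_celeba.zip, so the file on disk is not a valid archive. One way to see what was actually downloaded is to inspect the file before torchvision opens it. The sketch below is only an illustration: `describe_download` is a hypothetical helper, and the HTML check assumes the download host may have returned a web page (for example an error or quota notice) instead of the archive, which the issue itself does not confirm.

```python
import zipfile
from pathlib import Path


def describe_download(path):
    """Return a short diagnosis of a file that was expected to be a zip archive.

    Hypothetical helper, not part of torchvision.
    """
    path = Path(path)
    if not path.exists():
        return "missing"
    if zipfile.is_zipfile(str(path)):
        return "zip"
    # Not a zip: read the first bytes to see what was saved instead.
    with open(str(path), "rb") as f:
        head = f.read(512).lstrip().lower()
    # Assumption: a failed download may have stored an HTML page.
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return "unknown"


if __name__ == "__main__":
    # Path built from the root used in the report and the file named in the traceback.
    print(describe_download("../data/celeba/img_align_celeba.zip"))
```

If the result is anything other than "zip", deleting the file before retrying avoids re-reading the same invalid download.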

Environment

  • PyTorch version: 1.5.0+cu101
  • Is debug build: No
  • CUDA used to build PyTorch: 10.1
  • OS: Ubuntu 18.04.3 LTS
  • GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
  • CMake version: version 3.12.0
  • Python version: 3.6
  • Is CUDA available: Yes
  • CUDA runtime version: 10.1.243
  • GPU models and configuration: GPU 0: Tesla T4
  • Nvidia driver version: 418.67
  • cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5

Versions of relevant libraries:

  • [pip3] numpy==1.18.4
  • [pip3] torch==1.5.0+cu101
  • [pip3] torchsummary==1.5.1
  • [pip3] torchtext==0.3.1
  • [pip3] torchvision==0.6.0+cu101

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 23
  • Comments: 19 (3 by maintainers)

Top GitHub Comments

20 reactions
import-antigravity commented, Mar 16, 2021

This is still an issue FYI

6 reactions
Ji-Xinyou commented, Jun 14, 2022

Problem still exists. (Jun 14)


Top Results From Across the Web

python - Error downloading celebA dataset using torchvision
Download those files from here, and copy the files in a new directory called celeba; Unzip img_align_celeba.zip in the same directory.
CelebA dataset download errors - vision - PyTorch Forums
I am having issues downloading the CelebA dataset. It appears that some of the data is not in .zip format which is throwing...
celeb_a | TensorFlow Datasets
Description: CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 ...
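
The manual workaround quoted in the first result above (copy the downloaded files into a directory called celeba and unzip img_align_celeba.zip there) can be sketched as follows. This is a minimal illustration, not torchvision's own code: `extract_celeba_images` is a hypothetical helper, and it assumes the archive was already fetched by hand into `<root>/celeba`.

```python
import zipfile
from pathlib import Path


def extract_celeba_images(root):
    """Unzip a manually downloaded img_align_celeba.zip inside <root>/celeba.

    Hypothetical helper; the folder and file names follow the traceback
    and the workaround quoted above.
    """
    folder = Path(root) / "celeba"
    archive = folder / "img_align_celeba.zip"
    # is_zipfile returns False for a missing file as well as for a non-zip file.
    if not zipfile.is_zipfile(str(archive)):
        raise ValueError(f"{archive} is missing or is not a valid zip archive")
    with zipfile.ZipFile(str(archive), "r") as f:
        f.extractall(str(folder))
    return folder
```

After extraction, the dataset can presumably be constructed with `download=False`, for example `datasets.CelebA('../data', split="train", download=False)`. The `_check_integrity` step visible in the traceback may still reject the folder if the other dataset files are absent, so treat this as a starting point.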