MNIST dataset no longer downloads (repeat of old fixed March 2020 problem)
See original GitHub issue🐛 Bug
This seems to be a recurrence of an issue spotted in #1938 which was fixed back in March 2020 and then closed, but has now reappeared. There are a number of people in #1938 reporting that the issue appeared somewhere in the last 12 hours.
To Reproduce
Steps to reproduce the behavior:
- Open a fresh google colab
- Try something like the following:
import torchvision
from torchvision import datasets, transforms
transform = transforms.Compose([transforms.ToTensor(),
transforms.Normalize((0.5,), (0.5,)),
])
trainset = datasets.MNIST('PATH_TO_STORE_TRAINSET', download=True, train=True, transform=transform)
Results in:
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to PATH_TO_STORE_TRAINSET/MNIST/raw/train-images-idx3-ubyte.gz
0/? [00:00<?, ?it/s]
---------------------------------------------------------------------------
HTTPError Traceback (most recent call last)
<ipython-input-16-492e382ce34e> in <module>()
4 transforms.Normalize((0.5,), (0.5,)),
5 ])
----> 6 trainset = datasets.MNIST('PATH_TO_STORE_TRAINSET', download=True, train=True, transform=transform)
11 frames
/usr/local/lib/python3.7/dist-packages/torchvision/datasets/mnist.py in __init__(self, root, train, transform, target_transform, download)
77
78 if download:
---> 79 self.download()
80
81 if not self._check_exists():
/usr/local/lib/python3.7/dist-packages/torchvision/datasets/mnist.py in download(self)
144 for url, md5 in self.resources:
145 filename = url.rpartition('/')[2]
--> 146 download_and_extract_archive(url, download_root=self.raw_folder, filename=filename, md5=md5)
147
148 # process and save as torch files
/usr/local/lib/python3.7/dist-packages/torchvision/datasets/utils.py in download_and_extract_archive(url, download_root, extract_root, filename, md5, remove_finished)
254 filename = os.path.basename(url)
255
--> 256 download_url(url, download_root, filename, md5)
257
258 archive = os.path.join(download_root, filename)
/usr/local/lib/python3.7/dist-packages/torchvision/datasets/utils.py in download_url(url, root, filename, md5)
82 )
83 else:
---> 84 raise e
85 # check integrity of downloaded file
86 if not check_integrity(fpath, md5):
/usr/local/lib/python3.7/dist-packages/torchvision/datasets/utils.py in download_url(url, root, filename, md5)
70 urllib.request.urlretrieve(
71 url, fpath,
---> 72 reporthook=gen_bar_updater()
73 )
74 except (urllib.error.URLError, IOError) as e: # type: ignore[attr-defined]
/usr/lib/python3.7/urllib/request.py in urlretrieve(url, filename, reporthook, data)
245 url_type, path = splittype(url)
246
--> 247 with contextlib.closing(urlopen(url, data)) as fp:
248 headers = fp.info()
249
/usr/lib/python3.7/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
220 else:
221 opener = _opener
--> 222 return opener.open(url, data, timeout)
223
224 def install_opener(opener):
/usr/lib/python3.7/urllib/request.py in open(self, fullurl, data, timeout)
529 for processor in self.process_response.get(protocol, []):
530 meth = getattr(processor, meth_name)
--> 531 response = meth(req, response)
532
533 return response
/usr/lib/python3.7/urllib/request.py in http_response(self, request, response)
639 if not (200 <= code < 300):
640 response = self.parent.error(
--> 641 'http', request, response, code, msg, hdrs)
642
643 return response
/usr/lib/python3.7/urllib/request.py in error(self, proto, *args)
567 if http_err:
568 args = (dict, 'default', 'http_error_default') + orig_args
--> 569 return self._call_chain(*args)
570
571 # XXX probably also want an abstract factory that knows when it makes
/usr/lib/python3.7/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
501 for handler in handlers:
502 func = getattr(handler, meth_name)
--> 503 result = func(*args)
504 if result is not None:
505 return result
/usr/lib/python3.7/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
647 class HTTPDefaultErrorHandler(BaseHandler):
648 def http_error_default(self, req, fp, code, msg, hdrs):
--> 649 raise HTTPError(req.full_url, code, msg, hdrs, fp)
650
651 class HTTPRedirectHandler(BaseHandler):
HTTPError: HTTP Error 403: Forbidden
Expected behavior
The dataset should just be loaded (as indeed it was this morning).
Environment
Collecting environment information...
PyTorch version: 1.7.1+cu101
Is debug build: False
CUDA used to build PyTorch: 10.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 18.04.5 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
CMake version: version 3.12.0
Python version: 3.7 (64-bit runtime)
Is CUDA available: False
CUDA runtime version: 11.0.221
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.0.4
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.0.4
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.0.4
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.0.4
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.0.4
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.0.4
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.0.4
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.19.5
[pip3] torch==1.7.1+cu101
[pip3] torchsummary==1.5.1
[pip3] torchtext==0.3.1
[pip3] torchvision==0.8.2+cu101
[conda] Could not collect```
cc @pmeier
Issue Analytics
- State:
- Created 3 years ago
- Reactions:6
- Comments:10 (1 by maintainers)
Top Results From Across the Web
How to Develop a CNN for MNIST Handwritten Digit ...
The MNIST handwritten digit classification problem is a standard dataset used in computer vision and deep learning. Although the dataset is ...
Read more >Cannot load MNIST Original dataset using fetch_openml in ...
Method fetch_openml() download dataset from mldata.org which is not stable and can not connect. An alternative way is manually to download ...
Read more >Errata - O'Reilly Media
Version Location Submitted By
Safari Books Online ? Section: Computing Gradients Using Autodiff Thierry Herrmann
ePub Page ch. 14 TensorFlow Implementation Mohammed El‑Beltagy
Mobi Page ch....
Read more >PyTorch Datasets and DataLoaders for deep Learning
Class imbalance is a common problem, but in our case, we have just seen that the Fashion-MNIST dataset is indeed balanced, so we...
Read more >Quantization and Deployment of Deep Neural Networks on ...
By doing so, data do not need to be sent by the device to the cloud anymore. ... of more than 2, without...
Read more >
Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free
Top Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
copy this snippet at the top of your notebook, run it, and then just load your datasets as usual…
from six.moves import urllib
opener = urllib.request.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
urllib.request.install_opener(opener)
We have decided to do the user agent fix anyway. It will also make its way in the upcoming release (#3499).