Download dataset return 403

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

import torch.backends.cudnn
import torch.utils.data
import torchvision

# prepare parameters
n_epochs = 1  # 3
batch_size_train = 64
batch_size_test = 1000
learning_rate = 0.01
momentum = 0.5
log_interval = 10

random_seed = 1
torch.backends.cudnn.enabled = False
torch.manual_seed(random_seed)

# prepare dataset
train_loader = torch.utils.data.DataLoader(
    torchvision.datasets.MNIST('./tmp/files/', train=True, download=True,
                               transform=torchvision.transforms.Compose([
                                   torchvision.transforms.ToTensor(),
                                   torchvision.transforms.Normalize((0.1307,), (0.3081,))
                               ])),
    batch_size=batch_size_train, shuffle=True)

test_loader = torch.utils.data.DataLoader(
    torchvision.datasets.MNIST('./tmp/files/', train=False, download=True,
                               transform=torchvision.transforms.Compose([
                                   torchvision.transforms.ToTensor(),
                                   torchvision.transforms.Normalize(
                                       (0.1307,), (0.3081,))
                               ])),
    batch_size=batch_size_test, shuffle=True)


Expected behavior

Download dataset.

Environment

Please copy and paste the output from our environment collection script (or fill out the checklist below manually).

You can get the script and run it with:

wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
python collect_env.py

PyTorch version: 1.5.1+cu101
Is debug build: False
CUDA used to build PyTorch: 10.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 16.04.6 LTS (x86_64)
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
Clang version: Could not collect
CMake version: version 3.5.1

Python version: 3.6 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: GeForce RTX 2080 Ti
GPU 1: GeForce RTX 2080 Ti

Nvidia driver version: 440.33.01
cuDNN version: /usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudnn.so.7
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.18.5
[pip3] numpydoc==1.1.0
[pip3] torch==1.5.1+cu101
[pip3] torchtext==0.7.0
[pip3] torchvision==0.6.1+cu101
[conda] blas 1.0 mkl
[conda] mkl 2020.1 217
[conda] mkl-service 2.3.0 py36he904b0f_0
[conda] mkl_fft 1.1.0 py36h23d657b_0
[conda] mkl_random 1.1.1 py36h0573a6f_0
[conda] numpy 1.18.5 py36ha1c710e_0
[conda] numpy-base 1.18.5 py36hde5b4d6_0
[conda] numpydoc 1.1.0 py_0
[conda] torch 1.5.1+cu101 pypi_0 pypi
[conda] torchtext 0.7.0 pypi_0 pypi
[conda] torchvision 0.6.1+cu101 pypi_0 pypi

Additional context

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6 (1 by maintainers)

Top GitHub Comments

5 reactions
vfdev-5 commented, Mar 4, 2021

For those who cannot install torchvision master with the fix, you can try the following workaround: download and preprocess the dataset manually.

  • go to the MNIST folder under your dataset root (./tmp/files/MNIST in the example above)
  • create 2 scripts there, download.sh and process.py
download.sh

# sudo apt-get update && sudo apt-get install -y wget p7zip-full

mkdir -p raw
mkdir -p processed

cd raw

wget http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz

7z x train-images-idx3-ubyte.gz
7z x train-labels-idx1-ubyte.gz
7z x t10k-images-idx3-ubyte.gz
7z x t10k-labels-idx1-ubyte.gz

cd ..

python process.py
process.py
import os
import torch
from torchvision.datasets.mnist import read_image_file, read_label_file

raw_folder = "raw"
processed_folder = "processed"
training_file = 'training.pt'
test_file = 'test.pt'

### Code from https://github.com/pytorch/vision/blob/7d4154735f421b254c408c16e0980b1ca0dd9b8e/torchvision/datasets/mnist.py#L134
# process and save as torch files
print('Processing...')

training_set = (
    read_image_file(os.path.join(raw_folder, 'train-images.idx3-ubyte')),
    read_label_file(os.path.join(raw_folder, 'train-labels.idx1-ubyte'))
)
test_set = (
    read_image_file(os.path.join(raw_folder, 't10k-images.idx3-ubyte')),
    read_label_file(os.path.join(raw_folder, 't10k-labels.idx1-ubyte'))
)
with open(os.path.join(processed_folder, training_file), 'wb') as f:
    torch.save(training_set, f)
with open(os.path.join(processed_folder, test_file), 'wb') as f:
    torch.save(test_set, f)

print('Done!')
  • make sure wget, 7z, torchvision and torch are installed
  • run sh download.sh; it should download and preprocess the dataset
  • use torchvision’s MNIST code as usual

HTH
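
If 7z is not available, the extraction step of download.sh could also be done with Python's standard gzip module. The following is a minimal sketch under stated assumptions, not part of the original workaround: the helper names `gunzip_file` and `extract_all` and the `ARCHIVES` mapping are hypothetical, and the target names are simply the ones process.py reads from the raw folder.

```python
import gzip
import shutil
from pathlib import Path

# Hypothetical mapping: archive names fetched by download.sh -> file names
# that process.py opens in the raw folder.
ARCHIVES = {
    "train-images-idx3-ubyte.gz": "train-images.idx3-ubyte",
    "train-labels-idx1-ubyte.gz": "train-labels.idx1-ubyte",
    "t10k-images-idx3-ubyte.gz": "t10k-images.idx3-ubyte",
    "t10k-labels-idx1-ubyte.gz": "t10k-labels.idx1-ubyte",
}


def gunzip_file(src, dst):
    """Decompress the gzip archive at `src` into the file `dst`."""
    src, dst = Path(src), Path(dst)
    dst.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)  # streams, so the file is not held in memory
    return dst


def extract_all(raw_folder="raw"):
    """Extract every known archive that is present in `raw_folder`."""
    extracted = []
    for archive, target in ARCHIVES.items():
        src = Path(raw_folder) / archive
        if src.exists():
            extracted.append(gunzip_file(src, Path(raw_folder) / target))
    return extracted


if __name__ == "__main__":
    for path in extract_all("raw"):
        print("extracted", path)
```

Run from the MNIST folder after the wget step, this would replace the four 7z calls; process.py can then be run unchanged.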

0 reactions
liqing9399 commented, Mar 21, 2021

When I ran the process.py script above, I got this error: RuntimeError: shape '[60000, 28, 28]' is invalid for input of size 4482028. What is the cause?
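
The thread does not confirm a cause for this error. One possible explanation, offered only as an assumption, is that the raw image file is incomplete: the reshape expects 60000 × 28 × 28 = 47040000 values, but only 4482028 were read. A sketch like the one below could help check this by comparing the size declared in the IDX header with the size of the file on disk. The function name `inspect_idx_file` is hypothetical, and the sketch only handles unsigned-byte IDX files such as the MNIST ones.

```python
import os
import struct


def inspect_idx_file(path):
    """Return (dims, expected_bytes, actual_bytes) for an uncompressed IDX file.

    IDX layout: two zero bytes, one data-type byte, one byte holding the
    number of dimensions, then one big-endian 32-bit size per dimension,
    followed by the data itself.
    """
    with open(path, "rb") as f:
        header = f.read(4)
        if len(header) < 4:
            raise ValueError("file is too short to hold an IDX header")
        zero, dtype_code, ndim = struct.unpack(">HBB", header)
        if zero != 0:
            raise ValueError("not an IDX file (is it still gzip-compressed?)")
        if dtype_code != 0x08:
            raise ValueError("this sketch only handles unsigned-byte data")
        raw_dims = f.read(4 * ndim)
        if len(raw_dims) < 4 * ndim:
            raise ValueError("file ends inside the IDX header")
        dims = struct.unpack(">" + "I" * ndim, raw_dims)

    n_items = 1
    for d in dims:
        n_items *= d
    expected_bytes = 4 + 4 * ndim + n_items  # header + one byte per value
    return dims, expected_bytes, os.path.getsize(path)


if __name__ == "__main__":
    path = os.path.join("raw", "train-images.idx3-ubyte")
    dims, expected, actual = inspect_idx_file(path)
    print(dims, "expected bytes:", expected, "actual bytes:", actual)
```

For the training images, the header should declare dimensions (60000, 28, 28), which corresponds to 16 + 47040000 bytes on disk. If the actual size is smaller, downloading and extracting that file again would be the first thing to try.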
