torchvision.io.read_image returns a tensor whose shape differs from the documentation.

🐛 Bug

When reading a grayscale or RGBA image, torchvision.io.read_image returns a tensor whose shape differs from the [3, width, height] given in the documentation. It returns [1, width, height] for grayscale and [4, width, height] for RGBA.

https://pytorch.org/docs/stable/torchvision/io.html#torchvision.io.read_image

To Reproduce

Steps to reproduce the behavior:

>>> img =  torchvision.io.read_image(<grayscale image>)
>>> img.shape
(1, 123, 123)

>>> img =  torchvision.io.read_image(<RGBA image>)
>>> img.shape
(4, 123, 123)

Expected behavior

>>> img =  torchvision.io.read_image(<grayscale image>)
>>> img.shape
(3, 123, 123)

>>> img =  torchvision.io.read_image(<RGBA image>)
>>> img.shape
(3, 123, 123)
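One way to get the expected shapes regardless of what read_image returns is to normalize the channel dimension after loading. The following is a minimal sketch, not part of the original report. It assumes img is a torch.Tensor laid out as [channels, ...] with three dimensions. The helper name to_three_channels is hypothetical, and discarding alpha instead of compositing it is a simplification.

```python
def to_three_channels(img):
    """Return a 3-channel version of a [C, H, W] tensor with C in {1, 3, 4}.

    Assumes `img` is a torch.Tensor; only tensor methods are used, so no
    import is needed here.
    """
    channels = img.shape[0]
    if channels == 3:
        # Already RGB: nothing to do.
        return img
    if channels == 1:
        # Grayscale: repeat the single channel three times.
        return img.repeat(3, 1, 1)
    if channels == 4:
        # RGBA: keep the color channels and discard alpha (no compositing).
        return img[:3]
    raise ValueError(f"unsupported channel count: {channels}")
```

Replicating the grayscale channel only changes the layout and adds no color information, so this suits pipelines that simply require a fixed channel count.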

Environment

PyTorch version: 1.7.1
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.3 LTS (x86_64)
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Clang version: Could not collect
CMake version: version 3.10.2

Python version: 3.7 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: GeForce GTX 1080 Ti
Nvidia driver version: 440.100
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip] numpy==1.19.4
[pip] torch==1.7.1
[pip] torchaudio==0.7.0a0+a853dff
[pip] torchvision==0.8.2
[conda] blas 1.0 mkl
[conda] cudatoolkit 10.2.89 hfd86e86_1
[conda] mkl 2020.0 166
[conda] mkl-service 2.3.0 py37he904b0f_0
[conda] mkl_fft 1.0.15 py37ha843d7b_0
[conda] mkl_random 1.1.0 py37hd6b4f25_0
[conda] numpy 1.19.4 pypi_0 pypi
[conda] pytorch 1.7.1 py3.7_cuda10.2.89_cudnn7.6.5_0 pytorch
[conda] torchaudio 0.7.2 py37 pytorch
[conda] torchvision 0.8.2 py37_cu102 pytorch

Issue Analytics

Created 3 years ago
Comments: 13 (5 by maintainers)

Top GitHub Comments

2 reactions

ckyleda commented, Jun 15, 2021

This works for images that are grayscale; but I have RGB images where the actual channels are important and replicating the information across all channels is not desired behavior.

It blows the mind that defaulting to single-channel image reading was ever implemented in the first place. I suspect this probably means I cannot use torch for my use case.

2 reactions

GDkids commented, May 26, 2021

Setting the argument ‘mode=ImageReadMode.RGB’ changes the output to [3, width, height]; the ImageReadMode class controls this directly. More information can be found at https://github.com/pytorch/vision/blob/master/torchvision/io/image.py#L234-L248. I ran into this question today and this issue was the first link I found, so I think a comment here may be useful for later viewers.
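The mode argument mentioned above can be sketched as follows. This assumes a torchvision release in which read_image accepts mode and torchvision.io exports ImageReadMode; the 0.8.2 build in the report appears to predate it. The helper name load_rgb is hypothetical.

```python
def load_rgb(path):
    """Read an image file as a 3-channel uint8 tensor."""
    # Imported inside the function so that defining the helper does not
    # itself require torchvision; assumes a version providing ImageReadMode.
    from torchvision.io import ImageReadMode, read_image

    # ImageReadMode.RGB asks the decoder to convert grayscale or RGBA input
    # to three channels; the default mode keeps the file's own channels.
    return read_image(path, mode=ImageReadMode.RGB)
```

With this helper, the first dimension of the returned tensor should be 3 whether the file on disk is grayscale, RGB, or RGBA.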
