torchvision.io.read_image returns a tensor shape that differs from the documentation.

🐛 Bug

When reading a grayscale or RGBA image, torchvision.io.read_image returns a tensor whose shape differs from the [3, width, height] given in the documentation. It returns [1, width, height] for grayscale and [4, width, height] for RGBA.

https://pytorch.org/docs/stable/torchvision/io.html#torchvision.io.read_image

To Reproduce

Steps to reproduce the behavior:

>>> img = torchvision.io.read_image(<grayscale image>)
>>> img.shape
(1, 123, 123)

>>> img = torchvision.io.read_image(<RGBA image>)
>>> img.shape
(4, 123, 123)

Expected behavior

>>> img = torchvision.io.read_image(<grayscale image>)
>>> img.shape
(3, 123, 123)

>>> img = torchvision.io.read_image(<RGBA image>)
>>> img.shape
(3, 123, 123)

Environment

PyTorch version: 1.7.1
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.3 LTS (x86_64)
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Clang version: Could not collect
CMake version: version 3.10.2

Python version: 3.7 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: GeForce GTX 1080 Ti
Nvidia driver version: 440.100
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip] numpy==1.19.4
[pip] torch==1.7.1
[pip] torchaudio==0.7.0a0+a853dff
[pip] torchvision==0.8.2
[conda] blas 1.0 mkl
[conda] cudatoolkit 10.2.89 hfd86e86_1
[conda] mkl 2020.0 166
[conda] mkl-service 2.3.0 py37he904b0f_0
[conda] mkl_fft 1.0.15 py37ha843d7b_0
[conda] mkl_random 1.1.0 py37hd6b4f25_0
[conda] numpy 1.19.4 pypi_0 pypi
[conda] pytorch 1.7.1 py3.7_cuda10.2.89_cudnn7.6.5_0 pytorch
[conda] torchaudio 0.7.2 py37 pytorch
[conda] torchvision 0.8.2 py37_cu102 pytorch

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 13 (5 by maintainers)

Top GitHub Comments

2 reactions
ckyleda commented, Jun 15, 2021

This works for images that are grayscale; but I have RGB images where the actual channels are important and replicating the information across all channels is not desired behavior.

It blows the mind that defaulting to single-channel image reading was ever implemented in the first place. I suspect this probably means I cannot use torch for my use case.

2 reactions
GDkids commented, May 26, 2021

Setting the argument ‘mode=ImageReadMode.RGB’ changes the output to [3, width, height]. The ImageReadMode class controls this directly; more information can be found at ‘https://github.com/pytorch/vision/blob/master/torchvision/io/image.py#L234-L248’. I ran into this question today and found this page first, so I thought a comment here might be useful for later viewers.
