torchvision.io.read_image does not always fail gracefully

🐛 Bug

torchvision.io.read_image() will sometimes segfault or abort in other uncatchable ways on malformed images, rather than failing gracefully (e.g. with a RuntimeError).

To Reproduce

Steps to reproduce the behavior:

  1. Download a problematic image file (one that I have found is here)
  2. Try to load the image with torchvision.io.read_image:
>>> import torchvision
>>> image = torchvision.io.read_image("283xnnabju4z.png")
libpng warning: iCCP: known incorrect sRGB profile
munmap_chunk(): invalid pointer
Aborted (core dumped)

Expected behavior

I expected that trying to read an unsupported or malformed image would instead raise a RuntimeError or other catchable error so that it could be handled in code, rather than aborting.

Environment

PyTorch version: 1.8.1+cu102
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.5 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
CMake version: version 3.20.0

Python version: 3.8 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: GeForce GTX 1050
Nvidia driver version: 460.67
cuDNN version: /usr/local/cuda-10.2/lib64/libcudnn.so.7.6.4
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.20.1
[pip3] torch==1.8.1
[pip3] torchvision==0.9.1

Additional context

Something even stranger happens with this particular image: setting the mode to ImageReadMode.RGB allows it to be read once, but attempting to read it a second time fails as above (i.e. torchvision.io.read_image is not idempotent). I’m not sure whether this behavior is related, but whatever the root cause is, it would be nice to be able to just catch an error, e.g. to log the filename and skip the image during processing.

>>> import torchvision
>>> image = torchvision.io.read_image("283xnnabju4z.png", mode=torchvision.io.image.ImageReadMode.RGB)
libpng warning: iCCP: known incorrect sRGB profile
>>> image.shape
torch.Size([3, 1410, 2048])
>>> image = torchvision.io.read_image("283xnnabju4z.png", mode=torchvision.io.image.ImageReadMode.RGB)
libpng warning: iCCP: known incorrect sRGB profile
munmap_chunk(): invalid pointer
Aborted (core dumped)

Some quick investigation shows that the problematic images that exhibit this behavior are usually PNGs with a depth of 16 bits. OpenCV and PIL do not appear to have problems reading them.
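Since the failing files are usually 16-bit PNGs, one way to flag them ahead of time is to read the bit depth straight from the PNG header without decoding anything. The sketch below relies only on the PNG file layout (an 8-byte signature followed by the IHDR chunk); the helper names png_bit_depth and is_16bit_png are hypothetical and not part of torchvision.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_bit_depth(header: bytes):
    """Return the IHDR bit depth, or None if the bytes are not a PNG header.

    Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type ("IHDR"),
    then width (4), height (4), bit depth (1), color type (1), ...
    """
    if len(header) < 26:
        return None
    if header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        return None
    _width, _height, bit_depth, _color_type = struct.unpack(">IIBB", header[16:26])
    return bit_depth


def is_16bit_png(path):
    """Hypothetical pre-check: True if the file's header declares 16-bit samples."""
    with open(path, "rb") as f:
        return png_bit_depth(f.read(26)) == 16
```

Files flagged this way could be routed to another decoder (the report notes that OpenCV and PIL read them without problems) or simply logged and skipped. This is only a heuristic: the report says the failing images are "usually" 16-bit, not always.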

Additionally, the error message sometimes changes, e.g. to "Segmentation fault" or "double free or corruption (out)".
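Because these failures kill the interpreter instead of raising, a try/except around read_image cannot handle them. A possible workaround, sketched here under assumptions, is to probe each file in a child Python process and inspect its exit status. The helper can_decode and the PROBE snippet are hypothetical names introduced for illustration; only torchvision.io.read_image comes from the report.

```python
import subprocess
import sys

# Hypothetical probe: decode the file named in argv[1] inside a child interpreter.
PROBE = "import sys, torchvision; torchvision.io.read_image(sys.argv[1])"


def can_decode(path, probe=PROBE, timeout=60):
    """Return (ok, reason). A crash in the child cannot take down the caller."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", probe, path],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    if proc.returncode == 0:
        return True, "ok"
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child died from a signal
        # (e.g. SIGABRT or SIGSEGV).
        return False, f"killed by signal {-proc.returncode}"
    # Ordinary Python exceptions in the child end up here.
    return False, f"exited with code {proc.returncode}"
```

A caller could then log the filename and skip any file for which can_decode returns False. Note the cost of one interpreter start-up per file, and that a passing probe is not a guarantee: the report shows a first read succeeding and a second read in the same process aborting, so a sturdier variant would do all decoding in worker processes rather than only probing.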

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

andfoy commented, Mar 30, 2021

It seems the error happens when the PNG reading function tries to destroy the PNG read structure after catching the error. In other words, torchvision does catch the error, but it then segfaults when calling png_destroy_read_struct at

https://github.com/pytorch/vision/blob/978ba613518dd5b8d04c2a717e6e5ccf7fb172c3/torchvision/csrc/io/image/cpu/decode_png.cpp#L35

That call in turn reaches https://github.com/glennrp/libpng/blob/a37d4836519517bdce6cb9d956092321eca3e73b/pngread.c#L948, where png_free is an alias for free, so this error is related to memory management. I checked whether big_row_buf was NULL, but it wasn’t.

In my reproduction scenario, torchvision was able to load the image once, but the second call caused the segfault and produced the message libpng error: IDAT: bad parameters to zlib. According to this issue, https://github.com/ContinuumIO/anaconda-issues/issues/7315, that might be related to the version of zlib used when libpng is invoked. A user there commented that the segfault occurred on the second call to libpng, which is the same scenario we are seeing here.

The proposed solution involves downgrading zlib (which I haven’t verified myself). I’ll try compiling zlib and libpng myself to see if we can get more information.

fmassa commented, Jun 24, 2021

@NicolasHug yes, it would be good to have an issue to track supporting PNGs with more than 8 bits.
