“transforms.functional” Broken When Using Images With Alpha

🐛 Bug

Some “torchvision.transforms.functional” transformations such as “TF.rotate” and “TF.resize” break the image when the image is in RGBA format, as you can see here:

The blue channel is shown before the rotation on the left and after it on the right: wherever the alpha channel is black (zero), the other channels were also set to black.

Steps to reproduce the behavior:

1. Get an image with an alpha channel; this is the one I used:

2. Apply some of these functions to it, like this:

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import torchvision.transforms.functional as TF

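# Load an RGBA image (adjust the path to your own file).
img = Image.open(r'C:\your\folder\test.png')

# Show the blue channel (index 2) before the transform.
plt.imshow(np.asarray(img)[:, :, 2], cmap='gray'); plt.show()

# Rotate by 30 degrees; TF.resize(img, (512, 512)) shows the same problem.
img = TF.rotate(img, 30)

# Show the blue channel again: pixels where alpha is 0 are now black.
plt.imshow(np.asarray(img)[:, :, 2], cmap='gray'); plt.show()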

Expected behavior: It should act like “transforms.RandomRotation”, which works properly: the data in the first three channels is not affected by the data in the fourth channel. Not all problems tackled with “vision” involve images, and not every fourth channel is transparency. For example, in my case this is a vector bump map where the first three channels are the XYZ normal components and the fourth is a height component.

Note that this bug happens because PyTorch relies on PIL’s built-in functions, such as rotate, which are not meant for data science and simply throw away the data in the first three channels wherever the corresponding fourth-channel value is zero.
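One plausible mechanism for this data loss, which the issue itself does not confirm, is a premultiplied-alpha round trip: the color channels are scaled by alpha before interpolation and divided by alpha afterwards. The pure-Python sketch below illustrates that idea on single pixels. The helper names premultiply and unpremultiply are hypothetical and are not PIL or torchvision functions.

```python
def premultiply(pixel):
    """Scale R, G, B by alpha / 255 and keep alpha unchanged."""
    r, g, b, a = pixel
    return (round(r * a / 255), round(g * a / 255), round(b * a / 255), a)


def unpremultiply(pixel):
    """Undo premultiply() where possible."""
    r, g, b, a = pixel
    if a == 0:
        # The color channels were multiplied by zero, so the original
        # values cannot be recovered.
        return (0, 0, 0, 0)
    return (
        min(255, round(r * 255 / a)),
        min(255, round(g * 255 / a)),
        min(255, round(b * 255 / a)),
        a,
    )


opaque = (10, 20, 200, 255)  # fourth channel at its maximum
hidden = (10, 20, 200, 0)    # fourth channel is zero

print(unpremultiply(premultiply(opaque)))  # (10, 20, 200, 255)
print(unpremultiply(premultiply(hidden)))  # (0, 0, 0, 0)
```

Under this assumption, any pixel whose fourth channel is zero loses its first three channels, which matches the black regions described in the report.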

Workaround: use the opencv_transforms library instead, which works properly. It is a rewrite of PyTorch’s transforms based on OpenCV instead of PIL, so it is also faster:

https://github.com/jbohnslav/opencv_transforms
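If adding another dependency is not an option, a Pillow-only alternative might be to rotate each channel as an independent grayscale image and merge the results. This is a minimal sketch, not something proposed in the issue: rotate_per_channel is a hypothetical helper, and the synthetic 9x9 image stands in for real data.

```python
from PIL import Image


def rotate_per_channel(img, angle, resample=Image.NEAREST):
    """Rotate every band separately so no channel can influence another."""
    # split() yields one single-channel ("L") image per band.
    bands = [band.rotate(angle, resample=resample) for band in img.split()]
    # Reassemble the bands in their original order and mode.
    return Image.merge(img.mode, bands)


# Synthetic RGBA image: fourth channel is 0 everywhere, yet the first
# three channels carry data that should survive the rotation.
img = Image.new("RGBA", (9, 9), (10, 20, 200, 0))
rotated = rotate_per_channel(img, 30)

print(rotated.getpixel((4, 4)))  # (10, 20, 200, 0)
```

Areas exposed by the rotation (the corners here) are filled with 0 in every channel, so the fill value may still need adjusting for a given application.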

Top GitHub Comments

vfdev-5 commented, Oct 21, 2020

Currently, if we convert the input to a torch tensor and apply a rotation, it gives the following result:

import torch

...

img = Image.open('test.png')
np_img = np.asarray(img)
t_img = torch.from_numpy(np_img).permute(2, 0, 1)
out = TF.rotate(t_img, 30)
np_out = out.permute(1, 2, 0).numpy()
print(np_out.shape)

plt.figure(figsize=(20, 7))
for i in range(np_out.shape[-1]):
    plt.subplot(1, 4, i + 1)
    plt.imshow(np_out[:, :, i], cmap='gray')

vs PIL

So the image data where alpha = 0 remains present in the rotated image, but the fill value should be adapted to the application.

Let’s close the issue as solved. @Huud, feel free to reopen it if you need more support on that. Thanks.

fmassa commented, Oct 21, 2020

All transforms now work on Tensors as well, so I believe this issue should be fixed if you convert the PIL image to a Tensor and apply the transforms directly to the Tensor. @vfdev-5, can you double-check and close the issue if that’s the case?
