RandomGrayscale fails for grayscale tensor inputs

#1505 added rgb_to_grayscale support for tensor images, and #2586 aligned it with the PIL op. What was missed is that the PIL op is a no-op if the input is already grayscale, whereas the tensor op fails:

>>> import torch
>>> from torchvision.transforms import functional as F
>>> image_tensor = torch.rand(1, 16, 16)
>>> F.rgb_to_grayscale(image_tensor)
TypeError: Input image tensor permitted channel values are [3], but found 1
>>> image_pil = F.to_pil_image(image_tensor)
>>> F.rgb_to_grayscale(image_pil)
<PIL.Image.Image image mode=L size=16x16 at 0x7F8181384A10>

As both the name and the docstring

https://github.com/pytorch/vision/blob/d0dede0e09d6d72253080cb366742a270f0ea8cd/torchvision/transforms/functional.py#L1231-L1232

imply, this transform should only handle RGB inputs, so on its own that shouldn’t be an issue.

The problem is that the RandomGrayscale transform relies on the no-op behavior:

https://github.com/pytorch/vision/blob/d0dede0e09d6d72253080cb366742a270f0ea8cd/torchvision/transforms/transforms.py#L1613-L1616

>>> from torchvision import transforms
>>> transform = transforms.RandomGrayscale(p=1.0)
>>> transform(image_tensor)
TypeError: Input image tensor permitted channel values are [3], but found 1
>>> transform(image_pil)
<PIL.Image.Image image mode=L size=16x16 at 0x7F23E5B4E850>

Note that the docstring contradicts itself here:

https://github.com/pytorch/vision/blob/d0dede0e09d6d72253080cb366742a270f0ea8cd/torchvision/transforms/transforms.py#L1586-L1587

https://github.com/pytorch/vision/blob/d0dede0e09d6d72253080cb366742a270f0ea8cd/torchvision/transforms/transforms.py#L1595-L1596

A backward-compatible fix would be to patch RandomGrayscale to be an explicit no-op for grayscale inputs.
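As a rough illustration of that idea, here is a minimal user-side sketch rather than the actual torchvision patch. The names GrayscaleNoOp and get_num_channels are hypothetical and exist only for this example. It assumes tensor inputs are laid out as (..., C, H, W) and expose .shape and .clone(), as torch.Tensor does. Returning a copy instead of the input object itself is also an assumption made here.

```python
def get_num_channels(img):
    """Channel count of a tensor-like laid out as (..., C, H, W) or (H, W)."""
    shape = tuple(img.shape)
    # A 2-D input has no channel dimension and is treated as single-channel.
    return 1 if len(shape) == 2 else shape[-3]


class GrayscaleNoOp:
    """Hypothetical wrapper: bypass `transform` for single-channel tensors.

    `transform` is any callable, e.g. an instance of RandomGrayscale.
    Inputs without a `.shape` attribute (such as PIL images) are passed
    straight through to the wrapped transform.
    """

    def __init__(self, transform):
        self.transform = transform

    def __call__(self, img):
        if hasattr(img, "shape") and get_num_channels(img) == 1:
            # Already grayscale: skip the conversion, but hand back a copy
            # so callers never receive the very same object they passed in.
            return img.clone()
        return self.transform(img)
```

With torchvision installed, this could be used as GrayscaleNoOp(transforms.RandomGrayscale(p=1.0)). A tensor created with torch.rand(1, 16, 16) would then come back as a copy instead of raising the TypeError, while RGB tensors and PIL images would still reach the wrapped transform unchanged.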

cc @vfdev-5 @datumbox

Issue Analytics

Created 2 years ago
Comments: 8 (4 by maintainers)

Top GitHub Comments

1 reaction

vfdev-5 commented, Mar 10, 2022

PIL also makes a copy if the modes are the same: https://github.com/python-pillow/Pillow/blob/92c26a77ca53a2bfbd8804f009c6c8755d0e5a43/src/PIL/Image.py#L945-L946

1 reaction

pmeier commented, Mar 10, 2022

Are we going to copy or not?

I would, yes. Given that the op currently won’t return the input tensor, users might rely on this. Given that no one has raised an issue about this before (that I know of), I’m guessing this is an edge case anyway. Thus, the potential extra copy shouldn’t be a problem here.
