RandomGrayscale fails for grayscale tensor inputs
#1505 added rgb_to_grayscale for tensor images and #2586 aligned it with the PIL op. What was missed is that the PIL op is a no-op if the input is already grayscale, whereas the tensor op fails:
>>> import torch
>>> from torchvision.transforms import functional as F
>>> image_tensor = torch.rand(1, 16, 16)
>>> F.rgb_to_grayscale(image_tensor)
TypeError: Input image tensor permitted channel values are [3], but found 1
>>> image_pil = F.to_pil_image(image_tensor)
>>> F.rgb_to_grayscale(image_pil)
<PIL.Image.Image image mode=L size=16x16 at 0x7F8181384A10>
As both the name and the docstring imply, this transform should only handle RGB inputs, so that shouldn’t be an issue by itself.
The problem is that the RandomGrayscale transform relies on the no-op behavior:
>>> from torchvision import transforms
>>> transform = transforms.RandomGrayscale(p=1.0)
>>> transform(image_tensor)
TypeError: Input image tensor permitted channel values are [3], but found 1
>>> transform(image_pil)
<PIL.Image.Image image mode=L size=16x16 at 0x7F23E5B4E850>
Note the docstring is conflicting here:
A BC-compatible fix would be to patch RandomGrayscale to be an explicit no-op for grayscale inputs.
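To illustrate the proposed behavior, here is a minimal torch-only sketch. It is not the torchvision implementation: the function name `random_grayscale_sketch` is hypothetical, the input is assumed to be a tensor of shape (..., C, H, W), and the luma weights are the usual ITU-R 601-2 values, written out here as an assumption. Grayscale inputs are returned as a copy instead of raising.

```python
import torch


def random_grayscale_sketch(img: torch.Tensor, p: float = 0.1) -> torch.Tensor:
    """Hypothetical stand-in for RandomGrayscale on (..., C, H, W) tensors."""
    num_channels = img.shape[-3]
    if num_channels == 1:
        # Explicit no-op for grayscale inputs. A copy is returned so that
        # the result never aliases the input.
        return img.clone()
    if num_channels != 3:
        raise TypeError(f"Expected 1 or 3 channels, but found {num_channels}")

    # torch.rand samples from [0, 1), so p=1.0 always converts
    # and p=0.0 never does.
    if torch.rand(1).item() < p:
        r, g, b = img.unbind(dim=-3)
        # Assumed ITU-R 601-2 luma weights.
        gray = (0.2989 * r + 0.587 * g + 0.114 * b).to(img.dtype)
        # Keep the channel count of the input by repeating the gray channel.
        return gray.unsqueeze(dim=-3).expand(img.shape)
    return img
```

With this sketch, `random_grayscale_sketch(torch.rand(1, 16, 16), p=1.0)` returns a tensor equal to the input rather than raising a TypeError, which matches what the PIL path already does.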
Issue Analytics
- State:
- Created 2 years ago
- Comments:8 (4 by maintainers)
PIL does a copy as well if modes are same: https://github.com/python-pillow/Pillow/blob/92c26a77ca53a2bfbd8804f009c6c8755d0e5a43/src/PIL/Image.py#L945-L946
I would, yes. Given that the op currently won’t return the input tensor, users might rely on this. Given that no one has raised an issue (that I know of) about this before, I’m guessing this is an edge case anyway. Thus, the potential extra copy shouldn’t be a problem here.