[docs] RandomResizedCrop signature quirks

Why do we need to have so much precision: 1.333333333333? 1.3 would suffice and be more readable
The default interpolation is rendered as <InterpolationMode.BILINEAR: 'bilinear'>. It’s not clear from this whether the enum or the constant should be preferred (and the repr is not copy-pastable into code). It’s also not clear which type is used for InterpolationMode, especially given that for a long time only PIL.Image.InterpolationMode was supported, hence even new codebases are using the legacy PIL types: https://github.com/facebookresearch/dino/blob/58aabc0/main_dino.py#L420
Why does InterpolationMode live in torchvision.transforms.functional whereas the type using it lives in torchvision.transforms? This enum is basic enough to deserve placement in torchvision.transforms directly IMHO.
The doc text mentions torchvision.transforms.InterpolationMode, but it’s incorrect since the type currently lives in transforms.functional: https://pytorch.org/vision/master/transforms.html?highlight=randomresizedcrop#torchvision.transforms.functional.InterpolationMode
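The first two points can be reproduced without torchvision. The sketch below uses a stand-in enum (not the real torchvision class) and a hypothetical helper, `readable_default`, to show where the long float and the angle-bracket repr come from and what a more copy-pasteable rendering might look like. The (3/4, 4/3) ratio is assumed here to be the default that produces the 1.3333... value.

```python
from enum import Enum


class InterpolationMode(Enum):
    """Stand-in for illustration only; not the torchvision class."""
    NEAREST = "nearest"
    BILINEAR = "bilinear"
    BICUBIC = "bicubic"


def readable_default(value):
    """Hypothetical helper: render a default value so it can be pasted into code."""
    if isinstance(value, Enum):
        # "InterpolationMode.BILINEAR" instead of "<InterpolationMode.BILINEAR: 'bilinear'>"
        return f"{type(value).__name__}.{value.name}"
    if isinstance(value, float):
        # Two decimals are enough for a human reading a signature
        return f"{value:.2f}"
    if isinstance(value, tuple):
        return "(" + ", ".join(readable_default(v) for v in value) + ")"
    return repr(value)


ratio = (3.0 / 4.0, 4.0 / 3.0)  # assumed default aspect-ratio range

print(repr(ratio[1]))                         # full float precision
print(repr(InterpolationMode.BILINEAR))       # angle-bracket enum repr
print(readable_default(ratio))                # shortened floats
print(readable_default(InterpolationMode.BILINEAR))
```

The shortened form is only for display in a rendered signature. The actual default would still be the exact 4/3 value.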

Top GitHub Comments

vadimkantorov commented, May 6, 2021

I propose to also have strings supported as interpolation mode (as Kornia does)…
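One way string support could look is sketched below. This is not torchvision code: the enum is a stand-in and `as_interpolation_mode` is a hypothetical normalizer that accepts either an enum member or its string value.

```python
from enum import Enum


class InterpolationMode(Enum):
    """Stand-in for illustration only; not the torchvision class."""
    NEAREST = "nearest"
    BILINEAR = "bilinear"
    BICUBIC = "bicubic"


def as_interpolation_mode(mode):
    """Hypothetical helper: accept an enum member or a string such as "bilinear"."""
    if isinstance(mode, InterpolationMode):
        return mode
    if isinstance(mode, str):
        try:
            # Enum lookup by value; lower() makes "BILINEAR" work too
            return InterpolationMode(mode.lower())
        except ValueError:
            valid = ", ".join(m.value for m in InterpolationMode)
            raise ValueError(
                f"unknown interpolation mode {mode!r}; expected one of: {valid}"
            ) from None
    raise TypeError(
        f"interpolation must be a str or InterpolationMode, got {type(mode).__name__}"
    )


print(as_interpolation_mode("bilinear"))
print(as_interpolation_mode(InterpolationMode.BICUBIC))
```

Normalizing once at the boundary would let the rest of a transform keep working with enum members only.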

fmassa commented, May 10, 2021

> These interpolation modes naming are quite standard across opencv / pil / kornia

Well, each of those libraries has different names / conventions for each method, but both PIL and OpenCV use some sort of enums (and not strings) for representing the mode. From this perspective, using a proper Python enum seems appropriate to me. But I would be ok making it support a string as well (which is what TF does for tf.image.resize).

> So some clearness / examples would be great - and copy-pasteability, that is the enum type should be fully qualified (despite being verbose)

That sounds ok to me.

> Probably also to just be supporting exactly the same interface as F.interpolate does, to not force users to have different interpolate mode typing in different parts of pytorch

This is something which is worth further discussion. Now that torchvision transforms support both Tensor and PIL, there is indeed a question on whether we should follow PIL (for backward compatibility) or lean towards PyTorch.

IMO, the current situation with PyTorch interpolate is a bit messy, with interpolation modes being linear / bilinear / trilinear / etc, while only one single option actually works for a given input shape. I would lean towards not following what PyTorch does wrt that, but instead focus on providing more generic abstractions in PyTorch for interpolating over any arbitrary number of dimensions, which is what I proposed in https://github.com/pytorch/pytorch/issues/10482. We are slowly getting there by unifying the CPU implementations of all those methods, see https://github.com/pytorch/pytorch/pull/54500
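To illustrate the "only one option works for a given input shape" point: for the linear family, F.interpolate expects "linear" for 3-D input, "bilinear" for 4-D and "trilinear" for 5-D. The sketch below is a hypothetical helper (`resolve_linear_mode`, not part of PyTorch or torchvision) showing how a single generic "linear" request could be mapped to the dimension-specific name.

```python
# Dimension-specific names of the linear interpolation family,
# keyed by the number of dimensions of the input tensor (batch and channel included).
_LINEAR_MODE_BY_NDIM = {
    3: "linear",     # (N, C, L)
    4: "bilinear",   # (N, C, H, W)
    5: "trilinear",  # (N, C, D, H, W)
}


def resolve_linear_mode(ndim):
    """Hypothetical helper: pick the linear-family mode name valid for `ndim` dimensions."""
    try:
        return _LINEAR_MODE_BY_NDIM[ndim]
    except KeyError:
        supported = sorted(_LINEAR_MODE_BY_NDIM)
        raise ValueError(
            f"no linear interpolation mode for {ndim}-D input; supported: {supported}"
        ) from None


for ndim in (3, 4, 5):
    print(ndim, resolve_linear_mode(ndim))
```

With a lookup like this, a caller would only state the interpolation family, and the dimension-specific name becomes an implementation detail.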
