Making `transformation` an optional parameter in FasterRCNN
Hello,
🚀 Feature
I think it would be more generic to expose transform
(https://github.com/pytorch/vision/blob/3d65fc6723f1e0709916f24d819d6e17a925b394/torchvision/models/detection/faster_rcnn.py#L231) as something users can replace, rather than always using the default one.
Motivation
I am applying transformations separately as part of data augmentation, which includes cropping and resizing. Hence I would prefer not to do this twice while retraining FasterRCNN.
Pitch
I would like to feed a fixed-size input into the network even though my images are variable-sized. At present, I do this by resizing the images separately as part of the DataLoader and adjusting the parameters of GeneralizedRCNNTransform accordingly.
Alternatives
My present way of using FasterRCNN is an alternative. Since my set of transformations is pre-defined, I have to apply hacks such as setting mean to 0., std to 1., and altering the min and max sizes to my default value (this would mean that scale=1 and interpolation would return the same image).
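To illustrate why this hack works, here is a minimal pure-Python sketch of the arithmetic involved. It assumes the resize rule in GeneralizedRCNNTransform scales the shorter side to min_size unless that would push the longer side past max_size. The helper names (`resize_scale`, `normalize`) and the 300x300 input size are made up for illustration and are not part of torchvision.

```python
def resize_scale(height, width, min_size, max_size):
    """Scale factor for an image of the given size.

    Simplified mirror of the assumed resize rule: scale the shorter side
    to min_size, unless that pushes the longer side past max_size, in
    which case scale the longer side to max_size instead.
    """
    short_side = min(height, width)
    long_side = max(height, width)
    return min(min_size / short_side, max_size / long_side)


def normalize(pixel, mean, std):
    """Per-channel normalization applied to a single pixel value."""
    return (pixel - mean) / std


# Hypothetical fixed input size produced by a separate augmentation pipeline.
FIXED_H, FIXED_W = 300, 300

# With min_size and max_size set to the fixed size, the scale factor is 1.
scale = resize_scale(FIXED_H, FIXED_W, min_size=300, max_size=300)

# With mean 0 and std 1, normalization leaves pixel values untouched.
unchanged = normalize(0.42, mean=0.0, std=1.0)

# With the FasterRCNN defaults (min_size=800, max_size=1333), the same
# image would be upscaled a second time.
default_scale = resize_scale(FIXED_H, FIXED_W, min_size=800, max_size=1333)
```

Note that even with a scale of 1, the default transform still batches and pads the images, so this only neutralizes the resize and normalization steps rather than removing the transform.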
Additional context
While the input to the network is fixed-size, I apply many other augmentations such as mirroring, random cropping, etc., inspired by SSD-based networks. Hence I would prefer to do all augmentation in one separate place instead of twice.
Thank you!
Edit: If you think this would be a meaningful change, I will be happy to send a Pull Request.
Issue Analytics
- Created 3 years ago
- Reactions:3
- Comments:9 (3 by maintainers)
I wonder if this is still interesting to anyone. I can give it a try. Should I open a new PR or continue #2288?
OK, so I will try to fix it. For now I am using a messy but working workaround (@qornifmlx, maybe this will help you):
Normalize was just ripped from the GeneralizedRCNNTransform class in torchvision/models/detection/transform. But I guess that you are already using some kind of preprocessing, so you can leave this with image_mean=0 and image_std=1. There should still be something in transform, though, and it should have [...] because it is called on line 99 of generalized RCNN.