Making `transformation` an optional parameter in FasterRCNN

Hello,

🚀 Feature

I think it would be more generic to have `transform` (https://github.com/pytorch/vision/blob/3d65fc6723f1e0709916f24d819d6e17a925b394/torchvision/models/detection/faster_rcnn.py#L231) as a function that users can modify, rather than a fixed default.

Motivation

I am applying transformations separately as part of data augmentation, including cropping and resizing. Hence I would prefer not to do them twice while retraining FasterRCNN.

Pitch

I would like to feed a fixed-size input into the network for variable-sized images. At present, I do this by resizing the images separately as part of the DataLoader and adjusting the parameters of GeneralizedRCNNTransform accordingly.

Alternatives

My present way of using FasterRCNN is an alternative. Since my set of transformations is predefined, I have to apply hacks such as setting mean to 0. and std to 1., and setting the min and max sizes to my default value (this means that scale=1 and interpolation returns the same image).

Additional context

While the input to the network is fixed size, I apply many other augmentations such as mirroring, random cropping, etc., inspired by SSD-based networks. Hence I would prefer to do all augmentation once, in a separate place, instead of twice.

Thank you!

Edit: If you think this would be a meaningful change, I will be happy to send a Pull Request.

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 3
  • Comments: 9 (3 by maintainers)

Top GitHub Comments

3 reactions
Zhylkaaa commented, Apr 25, 2021

I wonder if this is still of interest to anyone. I can give it a try. Should I open a new PR or continue #2288?

1 reaction
Zhylkaaa commented, May 17, 2021

OK, so I will try to fix it. For now I am using a messy but working workaround (@qornifmlx, maybe this will help you):

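from torchvision.models.detection import FasterRCNN

# backbone, anchor_generator, roi_pooler, box_head and box_predictor are defined elsewhere
model = FasterRCNN(backbone,
                   rpn_anchor_generator=anchor_generator,
                   box_roi_pool=roi_pooler,
                   box_head=box_head,
                   box_predictor=box_predictor,
                   image_mean=(0,),
                   image_std=(1,),
                   max_size=1000,
                   min_size=220)

# calculate mean and std from data

# T.Normalize is the custom class described below, not torchvision.transforms.Normalize
transform = T.Normalize(image_mean=(mean,), image_std=(std,))
# replace the built-in GeneralizedRCNNTransform with the custom one
model.transform = transform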

Normalize was just ripped from the GeneralizedRCNNTransform class in torchvision/models/detection/transform.py. I guess you are already using some kind of preprocessing, so you can leave this with image_mean=0 and image_std=1, but there still has to be something in transform, and it has to have

def postprocess(self,
                result,               # type: List[Dict[str, Tensor]]
                image_shapes,         # type: List[Tuple[int, int]]
                original_image_sizes  # type: List[Tuple[int, int]]
                ):

because it is called on line 99 of generalized_rcnn.py.

