[RFC] New Augmentation techniques in Torchvision


🚀 Feature

Inclusion of new Augmentation techniques in torchvision.transforms.

Motivation

Transforms are important for data augmentation 😅

Proposals

RandAugment Citations 306 #4348
~~Cutout Citations 964~~ Superseded by RandomErasing
MixUp Citations 1624 Code (Note it is CC-BY-NC-license) #4379
CutMix Citations 437 Code ICCV 2019 #4379
TrivialAugment ICCV 2021 #4221
AugMix Citations 157
Scale-Jitter Citations 11 - code - benchmarks
Simple CopyPaste #5825
Port SSD & SSDlite Augmentations and Mixup/Cutmix from references to vision
Detection Transforms #1406 #2213 #3980
AutoAugment Detection 231 Citations code
Greedy Search Policy
FastAutoAugment Citations 135 NeurIPS 2019 Code (1k+ GitHub stars)
ReMixMatch Citations 165
FixMatch Citations 258
Transforms for color spaces augmentations E.g. rgb2hsv, rgb2bgr, rgb2lab, etc #4029
Gaussian Noise See #6192
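
For readers unfamiliar with the batch-level techniques in the list above, here is a minimal sketch of the MixUp idea in plain PyTorch. It is not the implementation tracked in #4379: the helper name `mixup_batch`, its parameters, and the default `alpha=0.2` are assumptions made for illustration only.

```python
import torch
import torch.nn.functional as F


def mixup_batch(images, labels, num_classes, alpha=0.2):
    """Hypothetical helper: blend each sample with a randomly chosen partner."""
    # Mixing coefficient drawn from Beta(alpha, alpha); alpha=0.2 is an assumed default.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    # Random pairing of samples within the batch.
    perm = torch.randperm(images.size(0))
    # Labels become soft targets so they can be blended like the images.
    targets = F.one_hot(labels, num_classes).to(images.dtype)
    mixed_images = lam * images + (1.0 - lam) * images[perm]
    mixed_targets = lam * targets + (1.0 - lam) * targets[perm]
    return mixed_images, mixed_targets


images = torch.rand(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))
mixed_images, mixed_targets = mixup_batch(images, labels, num_classes=10)
```

Because the targets are blended with the same coefficient as the images, each row of `mixed_targets` still sums to 1, so it can be used with a loss that accepts soft labels.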

Additional context

To visitors: kindly give a 👍 if you think any of these would help in your work.

Also, if you have any transform in mind, please provide a few details here!

Linked to #3221

cc @vfdev-5 @fmassa

Issue Analytics

State:
Created 2 years ago
Reactions: 16
Comments: 13 (11 by maintainers)

Top GitHub Comments

3 reactions

oke-aditya commented, May 25, 2021

I think the same. I compared both implementations. RandomErasing is newer than Cutout, and the two augmentations produce very similar results.

Also, as per the docs, RandomErasing does not work for PIL Images. It works only for torch.Tensor. I am not sure if that is intentional or needs some work.

2 reactions

datumbox commented, Jun 21, 2022

@lezwon Thanks a lot for offering to help!

ReMixMatch focuses on learning augmentations and on using unlabelled data. One challenge with that is that the majority of the changes will have to land in the reference scripts, which are outside of TorchVision. Currently the reference scripts need some rework to reduce the amount of duplicate code and improve the overall quality. That is at the top of our to-do list, and until it's done we would ideally like to avoid introducing significantly complex techniques like ReMixMatch.

I wonder if you would be interested in implementing the AutoAugment Detection algorithm listed above. @vfdev-5 has already added most of the necessary low-level kernels for doing transforms on the BBoxes in torchvision.prototype, so what’s needed is to implement the AutoAugment technique itself. Of course, since it touches prototype APIs, it can be tricky too. Let me know your thoughts, and perhaps Victor can also pitch in to see if it makes sense to work together and test the new API. Alternatively, we can discuss another contribution that you find interesting.

BTW I’m currently working on the SimpleCopyPaste contribution, trying to see if we can train more accurate models using it. I’ll let you know when I have the full results. 😃
