[Question] Significant difference in performance between Kornia and Torchvision image augmentations

I have trained two models that use the same sequence of image augmentations, one implemented in Torchvision and the other in Kornia, and I’m observing a significant difference in their performance. I understand that even with fixed random seeds the sampled augmentations may differ, which could cause some difference in test accuracy. On average, though, I would expect both models to reach similar accuracies, especially when the values are averaged over multiple seeds. That is not the case.

# PyTorch transformation
from torchvision import transforms  # assumed import; not shown in the original post

train_orig_transform = transforms.Compose([
    transforms.RandomResizedCrop(32),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomApply([transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),
    transforms.RandomGrayscale(p=0.2),
    transforms.ToTensor(),
    transforms.Normalize([0.4914, 0.4822, 0.4465], [0.2023, 0.1994, 0.2010])
    ])

This is the Kornia version of the above PyTorch transformation


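import torch
import torch.nn as nn
import kornia.augmentation as K  # assumed alias; the original post does not show its imports


class KorniaAugmentationModule(nn.Module):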
    def __init__(self, batch_size=512):
        super().__init__()
        # These are standard values for CIFAR10
        self.mu = torch.Tensor([0.4914, 0.4822, 0.4465])
        self.sigma = torch.Tensor([0.2023, 0.1994, 0.2010])

        self.hor_flip_prob = 0.5
        self.jit_prob = 0.8
        self.gs_prob = 0.2

        self.crop = K.RandomResizedCrop(size=(32, 32), same_on_batch=False)
        self.hor_flip = K.RandomHorizontalFlip(p=self.hor_flip_prob, same_on_batch=False)
        self.jit = K.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0.1, p=self.jit_prob, same_on_batch=False)
        self.rand_grayscale = K.RandomGrayscale(p=self.gs_prob, same_on_batch=False)
        
        self.normalize = K.Normalize(self.mu, self.sigma)

    # Note that I should only normalize in test mode; no other type of augmentation should be performed
    
    def forward(self, x, params=None, mode='train'):
        B = x.shape[0]
        if mode == 'train':
            x = self.crop(x, params['crop_params'])
            x = self.hor_flip(x, params['hor_flip_params'])
            x[params['jit_batch_probs']] = self.jit(x[params['jit_batch_probs']], params['jit_params'])
            x = self.rand_grayscale(x, params['grayscale_params'])

        x = self.normalize(x)
        return x

Rest of the code for training and testing these models is shared.

These are the training loss and testing accuracy curves for kornia (orange) and torchvision (green)

The difference in test accuracy between the two models is about 11%, which is very significant.

I have posted this question on the PyTorch discussion forum as well. Could you please give pointers on why this behavior is being observed, whether it is expected, and if not, how to debug it?

Top GitHub Comments

juliendenize commented, Mar 18, 2022

Another difference between Kornia and Torchvision that can be a source of error is how each one handles Gaussian blur.

In Kornia, sigma is a tuple used deterministically to provide the standard deviation for the x and y axes:

class GaussianBlur2d(nn.Module):
    ...
    def __init__(self,
                 kernel_size: Tuple[int, int],
                 sigma: Tuple[float, float],
                 border_type: str = 'reflect',
                 separable: bool = True) -> None:
        super().__init__()
        self.kernel_size: Tuple[int, int] = kernel_size
        self.sigma: Tuple[float, float] = sigma
        ...

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        return gaussian_blur2d(input,
                       self.kernel_size,
                       self.sigma,  # sigma directly used
                       self.border_type,
                       self.separable)

sigma = (0.1, 2.) # std 0.1 for y axis and 2. for x axis
kernel_size = (23, 23)
GaussianBlur2d(kernel_size, sigma)

In torchvision, sigma can be either a float used for both axes or a (min, max) tuple from which a single sigma is uniformly sampled and used for both axes:

class GaussianBlur(torch.nn.Module):
    ...
    def __init__(self, kernel_size, sigma=(0.1, 2.0)):
        ...
        self.sigma = sigma

    @staticmethod
    def get_params(sigma_min: float, sigma_max: float) -> float:
        return torch.empty(1).uniform_(sigma_min, sigma_max).item()

    def forward(self, img: Tensor) -> Tensor:
        sigma = self.get_params(self.sigma[0], self.sigma[1]) # sigma has been uniformly sampled
        return F.gaussian_blur(img, self.kernel_size, [sigma, sigma]) # use the sigma sampled for both axes

sigma = (0.1, 2.) # 0.1 sigma min and 2. sigma max
kernel_size = (23, 23)
GaussianBlur(kernel_size, sigma)

To get behavior similar to torchvision, one can create a RandomGaussianBlur2d as follows:

class RandomGaussianBlur2d(nn.Module):
    ...
    def __init__(self,
                 kernel_size: Tuple[int, int],
                 sigma: Tuple[float, float],
                 border_type: str = 'reflect',
                 separable: bool = True) -> None:
        super().__init__()
        self.kernel_size: Tuple[int, int] = kernel_size
        self.sigma: Tuple[float, float] = sigma
        ...

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        sigma = torch.empty(1).uniform_(self.sigma[0], self.sigma[1]).item() # sample one sigma, shared by both axes
        return gaussian_blur2d(input,
                       self.kernel_size,
                       (sigma, sigma),  # sampled sigma used for both axes
                       self.border_type,
                       self.separable)

sigma = (0.1, 2.)
kernel_size = (23, 23)
RandomGaussianBlur2d(kernel_size, sigma)

However, this solution does not randomize sigma at the instance level: a single sigma is sampled per forward call and shared by the whole batch. Per-instance randomness would require sampling a sigma tensor with one value per instance, creating filters according to each sigma, reshaping the input and stacking the filters to apply a grouped convolution, and then reshaping the convolution output back to the input shape.
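
The per-instance variant described above could be sketched roughly as follows. This is a minimal sketch in plain PyTorch, not Kornia's own implementation. The function names gaussian_kernel1d and per_sample_gaussian_blur are hypothetical, and the sketch assumes a floating-point (B, C, H, W) batch, an odd kernel size, and reflect padding to mirror the border_type='reflect' default shown above.

```python
import torch
import torch.nn.functional as F


def gaussian_kernel1d(kernel_size: int, sigma: torch.Tensor) -> torch.Tensor:
    """Return one normalized 1D Gaussian kernel per sigma, shape (B, kernel_size)."""
    half = (kernel_size - 1) / 2
    x = torch.arange(kernel_size, dtype=sigma.dtype, device=sigma.device) - half
    kernel = torch.exp(-0.5 * (x[None, :] / sigma[:, None]) ** 2)
    return kernel / kernel.sum(dim=1, keepdim=True)


def per_sample_gaussian_blur(x, kernel_size, sigma_min, sigma_max):
    """Blur each image of a (B, C, H, W) float batch with its own sampled sigma.

    Hypothetical helper: kernel_size is a single odd int used for both axes.
    Returns the blurred batch and the sampled sigmas.
    """
    b, c, h, w = x.shape
    # One sigma per instance, sampled uniformly from [sigma_min, sigma_max).
    sigma = torch.empty(b, dtype=x.dtype, device=x.device).uniform_(sigma_min, sigma_max)
    k1d = gaussian_kernel1d(kernel_size, sigma)          # (B, K)
    k2d = k1d[:, :, None] * k1d[:, None, :]              # (B, K, K), isotropic
    # Stack the filters: one per (instance, channel) pair.
    weight = k2d[:, None].repeat_interleave(c, dim=0)    # (B*C, 1, K, K)
    pad = kernel_size // 2
    # Fold the batch into the channel axis so each group sees one image channel.
    folded = x.reshape(1, b * c, h, w)
    folded = F.pad(folded, (pad, pad, pad, pad), mode="reflect")
    out = F.conv2d(folded, weight, groups=b * c)         # (1, B*C, H, W)
    # Reshape back to the input layout.
    return out.reshape(b, c, h, w), sigma
```

Because the kernels are normalized and the padding is reflective, a constant image should pass through unchanged, which gives a cheap sanity check before plugging something like this into a training pipeline.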

Both libraries work according to their documentation, so this is not a bug. However, it can trip up users who overlook the documentation (as I did), because both APIs use the same argument name, sigma, for different purposes.

lferraz commented, Feb 12, 2022

I checked the code in https://github.com/rdevon/kornia/commit/09d95360837e557a5d83f3bd891f0271d546ee62 and I think the easiest approach is to add a flag (OPENCV/PIL) to those 2 functionalities. If we set OPENCV as the default, we do not need to change anything else. What do you think @edgarriba @shijianjian @gmberton ?

we are talking about 10 lines of code… shouldn’t be a drama.
