Make functions work with arbitrary-dimensional images and videos

🚀 Feature

Make most functions work with images and videos of any number of dimensions.

Motivation

Most functions require inputs of a specific shape. This is quite restrictive and deep-learning-centric. I would like to relax this restriction and make them work with inputs of any number of dimensions. This would be consistent with torchvision, the kornia.color module, and this PR.

Pitch

So far, the following functions can be converted to support this:

kornia.enhance.adjust.solarize

kornia.enhance.adjust.posterize

kornia.enhance.adjust.sharpness

kornia.enhance.adjust.equalize

kornia.enhance.adjust.equalize3d

kornia.enhance.core.add_weighted

kornia.enhance.equalization.equalize_clahe

kornia.enhance.normalize.normalize_min_max

kornia.filters.blur.box_blur

kornia.filters.blur.blur_pool2d

kornia.filters.blur.max_blur_pool2d

kornia.filters.blur.canny

kornia.filters.filter.filter2D

kornia.filters.filter.filter3D

kornia.filters.laplacian.laplacian

kornia.filters.median.median_blur

kornia.filters.motion.motion_blur3d

kornia.filters.motion.sobel.spatial_gradient

kornia.filters.motion.sobel.spatial_gradient3d

kornia.filters.motion.sobel.sobel

kornia.filters.unsharp.unsharp_mask

kornia.geometry.subpix.dsnt.spatial_softmax2d

kornia.geometry.subpix.dsnt.spatial_expectation2d

kornia.geometry.subpix.spatial_soft_argmax.conv_soft_argmax2d

kornia.geometry.subpix.spatial_soft_argmax.conv_soft_argmax3d

kornia.geometry.subpix.spatial_soft_argmax.spatial_soft_argmax2d

kornia.geometry.subpix.spatial_soft_argmax.conv_quad_interp3d

kornia.geometry.transform.crop.crop2d.crop_by_boxes

kornia.geometry.transform.crop.crop2d.crop_by_transform_mat

kornia.geometry.transform.crop.crop3d.crop_by_boxes3d

kornia.geometry.transform.crop.crop3d.crop_by_transform_mat3d

kornia.geometry.transform.affwarp.affine

kornia.geometry.transform.affwarp.affine3d

kornia.geometry.transform.affwarp.rotate

kornia.geometry.transform.affwarp.rotate3d

kornia.geometry.transform.affwarp.translate

kornia.geometry.transform.affwarp.scale

kornia.geometry.transform.affwarp.shear

kornia.geometry.transform.affwarp.rescale

kornia.geometry.transform.elastic_transform.elastic_transform2d

kornia.geometry.transform.homography_warper.homography_warp

kornia.geometry.transform.homography_warper.homography_warp3d

kornia.geometry.transform.imgwarp.warp_perspective

kornia.geometry.transform.imgwarp.warp_affine

kornia.geometry.transform.imgwarp.remap

kornia.geometry.transform.projwarp.warp_affine3d

kornia.geometry.transform.projwarp.warp_perspective3d

kornia.geometry.transform.pyramid.build_pyramid

kornia.geometry.transform.thin_plate_spline.warp_image_tps

kornia.morphology.morphology.dilation

kornia.morphology.morphology.erosion

kornia.morphology.morphology.opening

kornia.morphology.morphology.closing

kornia.morphology.morphology.gradient

kornia.morphology.morphology.top_hat

kornia.morphology.morphology.bottom_hat

Alternatives

Additional context

Simple functions may be changed to work with any number of dimensions by using an ellipsis (...). Others that depend on torch functionality may require reshaping back and forth.
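
The two strategies can be sketched as follows. This is a minimal illustration, not kornia's implementation: center_crop_any and box_blur_any are hypothetical helper names, and the code assumes PyTorch is installed. The first function needs only ellipsis indexing. The second relies on torch.nn.functional.avg_pool2d, which expects a batched input, so it flattens the leading dimensions, runs the op, and then restores them.

```python
import torch
import torch.nn.functional as F


def center_crop_any(image: torch.Tensor, size: int) -> torch.Tensor:
    """Crop a centered size x size window from an input shaped (..., H, W)."""
    h, w = image.shape[-2:]
    top, left = (h - size) // 2, (w - size) // 2
    # The ellipsis leaves every leading dimension untouched.
    return image[..., top:top + size, left:left + size]


def box_blur_any(image: torch.Tensor, kernel_size: int = 3) -> torch.Tensor:
    """Box blur for an input shaped (..., H, W); kernel_size is assumed odd."""
    lead, (h, w) = image.shape[:-2], image.shape[-2:]
    # Collapse all leading dims into a single batch dim: (N, 1, H, W).
    flat = image.reshape(-1, 1, h, w)
    pad = kernel_size // 2
    flat = F.pad(flat, (pad, pad, pad, pad), mode="reflect")
    out = F.avg_pool2d(flat, kernel_size, stride=1)
    # Restore the original leading dims.
    return out.reshape(*lead, h, w)


video = torch.rand(2, 5, 3, 8, 8)  # (B, T, C, H, W)
print(center_crop_any(video, 4).shape)  # torch.Size([2, 5, 3, 4, 4])
print(box_blur_any(video).shape)        # torch.Size([2, 5, 3, 8, 8])
```

The reshape round trip costs little when the input is contiguous, since reshape can then return a view instead of copying.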

Please feel free to add anything to or remove anything from the list. I can make a PR for this.

Issue Analytics

Created 2 years ago
Comments: 6 (6 by maintainers)

Top GitHub Comments

justanhduc commented, Jun 9, 2021

Hi @edgarriba @lferraz

@justanhduc what do you suggest for this? A decorator might work so that we can enforce all the image-based operators to work with (..., H, W). Or at least, return the same shape as the input.

Yeah, as mentioned in “Additional context”, I think there are two ways: either add a decorator, as has been done for kornia.geometry.transform.affwarp.resize, or, in easier cases, use ... to index into tensors.

Please, correct me if I am wrong. I feel the only real problem is that videos require a format that breaks compatibility with images.

I think so too. The extra time dimension changes everything. However, I think you might have misunderstood this issue. I am not proposing to make the same functions work for both videos and images. Rather, I want the current video functions to work with videos of arbitrary dimensions of the form (..., D, H, W), and the image functions to work with (..., H, W). Correct me if I’m wrong.

In my opinion there are 2 approaches to solve this issue. The most efficient is changing the video format to be compatible with the image one. The other approach requires adding logic everywhere. That logic can be added using decorators, but take into account that it comes at an extra cost (similar to the issue of using asserts instead of ifs, I do not remember the number sorry).

Regarding the extra overhead, IMO it should not be much different, because most functions already contain a bunch of checks to make sure the input is in the correct form. If we add a decorator, it would do nothing when the input is already in the right form and only reshape when it is not. Also, I think that by adding a decorator we can hide a bunch of checks inside it, making the main code more readable.
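
A decorator along these lines might look like the sketch below. It is an illustration under stated assumptions, not the decorator kornia uses for resize: keep_leading_dims and downscale2x are hypothetical names, and the wrapped function is assumed to take a (B, C, H, W) tensor as its first argument and to return a tensor with the same batch size. Inputs that already have the expected number of dimensions pass straight through. Anything else is reshaped before the call and restored afterwards.

```python
import functools

import torch
import torch.nn.functional as F


def keep_leading_dims(expected_ndim: int):
    """Hypothetical decorator: let a strict (B, ...) function accept (..., ...) inputs."""

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(x: torch.Tensor, *args, **kwargs):
            # Fast path: input already in the expected form, nothing to do.
            if x.ndim == expected_ndim:
                return fn(x, *args, **kwargs)
            core = expected_ndim - 1  # trailing dims the function operates on
            if x.ndim < core:
                raise ValueError(f"expected at least {core} dims, got {x.ndim}")
            lead = x.shape[:-core]
            # Fold all leading dims (possibly none) into one batch dim.
            flat = x.reshape(-1, *x.shape[-core:])
            out = fn(flat, *args, **kwargs)
            # Restore leading dims; trailing dims come from the output,
            # so ops that change H and W still work.
            return out.reshape(*lead, *out.shape[1:])

        return wrapper

    return decorator


@keep_leading_dims(expected_ndim=4)
def downscale2x(image: torch.Tensor) -> torch.Tensor:
    """Strict operator: only accepts (B, C, H, W)."""
    if image.ndim != 4:
        raise ValueError("expected a (B, C, H, W) tensor")
    return F.avg_pool2d(image, kernel_size=2)


print(downscale2x(torch.rand(2, 5, 3, 8, 8)).shape)  # torch.Size([2, 5, 3, 4, 4])
print(downscale2x(torch.rand(3, 8, 8)).shape)        # torch.Size([3, 4, 4])
```

On the fast path the only added work is one integer comparison, which is in line with the point above that the overhead should be small next to the shape checks most functions already perform.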

justanhduc commented, Jun 9, 2021

@edgarriba Ok then I will start with the submodules in kornia.enhance.adjust first. Also I will make a WIP PR so we can discuss there.
