Make functions work with arbitrary-dimensional images and videos
🚀 Feature
Make most functionals work with images and videos that have any number of leading dimensions.
Motivation
Most functions require input of a specific shape. This is quite restrictive and deep-learning-centric. I would like to relax this restriction and make the functions work with inputs of any number of dimensions. This would be consistent with torchvision, the kornia.color module, and this PR.
Pitch
So far, the following functions can be converted to support this:
kornia.enhance.adjust.solarize
kornia.enhance.adjust.posterize
kornia.enhance.adjust.sharpness
kornia.enhance.adjust.equalize
kornia.enhance.adjust.equalize3d
kornia.enhance.core.add_weighted
kornia.enhance.equalization.equalize_clahe
kornia.enhance.normalize.normalize_min_max
kornia.filters.blur.box_blur
kornia.filters.blur.blur_pool2d
kornia.filters.blur.max_blur_pool2d
kornia.filters.blur.canny
kornia.filters.filter.filter2D
kornia.filters.filter.filter3D
kornia.filters.laplacian.laplacian
kornia.filters.median.median_blur
kornia.filters.motion.motion_blur3d
kornia.filters.sobel.spatial_gradient
kornia.filters.sobel.spatial_gradient3d
kornia.filters.sobel.sobel
kornia.filters.unsharp.unsharp_mask
kornia.geometry.subpix.dsnt.spatial_softmax2d
kornia.geometry.subpix.dsnt.spatial_expectation2d
kornia.geometry.subpix.spatial_soft_argmax.conv_soft_argmax2d
kornia.geometry.subpix.spatial_soft_argmax.conv_soft_argmax3d
kornia.geometry.subpix.spatial_soft_argmax.spatial_soft_argmax2d
kornia.geometry.subpix.spatial_soft_argmax.conv_quad_interp3d
kornia.geometry.transform.crop.crop2d.crop_by_boxes
kornia.geometry.transform.crop.crop2d.crop_by_transform_mat
kornia.geometry.transform.crop.crop3d.crop_by_boxes3d
kornia.geometry.transform.crop.crop3d.crop_by_transform_mat3d
kornia.geometry.transform.affwarp.affine
kornia.geometry.transform.affwarp.affine3d
kornia.geometry.transform.affwarp.rotate
kornia.geometry.transform.affwarp.rotate3d
kornia.geometry.transform.affwarp.translate
kornia.geometry.transform.affwarp.scale
kornia.geometry.transform.affwarp.shear
kornia.geometry.transform.affwarp.rescale
kornia.geometry.transform.elastic_transform.elastic_transform2d
kornia.geometry.transform.homography_warper.homography_warp
kornia.geometry.transform.homography_warper.homography_warp3d
kornia.geometry.transform.imgwarp.warp_perspective
kornia.geometry.transform.imgwarp.warp_affine
kornia.geometry.transform.imgwarp.remap
kornia.geometry.transform.projwarp.warp_affine3d
kornia.geometry.transform.projwarp.warp_perspective3d
kornia.geometry.transform.pyramid.build_pyramid
kornia.geometry.transform.thin_plate_spline.warp_image_tps
kornia.morphology.morphology.dilation
kornia.morphology.morphology.erosion
kornia.morphology.morphology.opening
kornia.morphology.morphology.closing
kornia.morphology.morphology.gradient
kornia.morphology.morphology.top_hat
kornia.morphology.morphology.bottom_hat
Alternatives
Additional context
Simple modules may be changed to work with any number of dimensions by indexing with the ellipsis (...). Others that depend on torch functionality may require reshaping back and forth.
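As a rough illustration of the two approaches, here is a minimal sketch. Both helper names (rgb_to_gray_any, avg_pool_any) are hypothetical and are not part of kornia; the first uses ellipsis indexing only, the second flattens the leading dimensions so that a torch function expecting (N, C, H, W) can be called, then restores them.

```python
import torch
import torch.nn.functional as F


def rgb_to_gray_any(image: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: grayscale conversion for input of shape (..., 3, H, W)."""
    # The ellipsis skips however many leading dims there are.
    r = image[..., 0:1, :, :]
    g = image[..., 1:2, :, :]
    b = image[..., 2:3, :, :]
    return 0.299 * r + 0.587 * g + 0.114 * b


def avg_pool_any(image: torch.Tensor, kernel_size: int = 2) -> torch.Tensor:
    """Hypothetical helper: average pooling for input of shape (..., H, W)."""
    # F.avg_pool2d needs a batched input, so fold all leading dims into the batch.
    lead = image.shape[:-2]
    flat = image.reshape(-1, 1, *image.shape[-2:])
    out = F.avg_pool2d(flat, kernel_size)
    # Reshape back: original leading dims followed by the new spatial size.
    return out.reshape(*lead, *out.shape[-2:])
```

With these, a tensor of shape (2, 5, 3, 8, 8) and a plain (8, 8) image go through the same code path; no caller-side reshaping is needed.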
Please feel free to add or remove anything to/in the list. I can make a PR for this.
Issue Analytics
- State:
- Created 2 years ago
- Comments:6 (6 by maintainers)

Hi @edgarriba @lferraz
Yeah, as mentioned in “Additional context”, I think there are two ways: either adding a decorator, like what has been done with kornia.geometry.transform.affwarp.resize, or, in easier cases, using ... to index into tensors.
I think so too. The extra time dimension changes everything. However, I think you might have misunderstood this issue. I am not proposing to make the same functions work for both videos and images. I want the current functions for videos to work with videos of arbitrary dimensions of the form ..., D, H, W, and those for images with the form ..., H, W. Correct me if I’m wrong.
Regarding the extra overhead cost, IMO it should not be much different, because most functions already contain a bunch of checks to make sure the input is in the correct form. If we add a decorator, it would do nothing when the input is already in the right form and would only reshape when it is not. Also, by adding a decorator we can hide a bunch of those checks inside it, making the main code more readable.
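To make the decorator idea concrete, here is a minimal sketch. The name keep_leading_dims and the example function box_blur_4d are hypothetical, and this is not the decorator kornia actually uses for resize. It assumes the wrapped function expects (B, C, *spatial) and keeps the batch size and number of dimensions of its input.

```python
from functools import wraps

import torch
import torch.nn.functional as F


def keep_leading_dims(num_spatial_dims: int):
    """Hypothetical decorator: lets a (B, C, *spatial) function accept (..., *spatial)."""
    core = num_spatial_dims + 1      # channel + spatial dims
    expected = num_spatial_dims + 2  # batch + channel + spatial dims

    def decorator(fn):
        @wraps(fn)
        def wrapper(x: torch.Tensor, *args, **kwargs):
            if x.dim() == expected:
                # Already in the right form: call through without reshaping.
                return fn(x, *args, **kwargs)
            if x.dim() < num_spatial_dims:
                raise ValueError(f"expected at least {num_spatial_dims} dims, got {x.dim()}")
            if x.dim() < expected:
                # Too few dims: add leading singleton dims, then strip them again.
                missing = expected - x.dim()
                out = fn(x[(None,) * missing], *args, **kwargs)
                return out[(0,) * missing]
            # Too many dims: fold the extra leading dims into the batch dim.
            lead = x.shape[:-core]
            out = fn(x.reshape(-1, *x.shape[-core:]), *args, **kwargs)
            return out.reshape(*lead, *out.shape[1:])

        return wrapper

    return decorator


@keep_leading_dims(num_spatial_dims=2)
def box_blur_4d(x: torch.Tensor, kernel_size: int = 3) -> torch.Tensor:
    """Example function written only for (B, C, H, W) input; odd kernel_size assumed."""
    pad = kernel_size // 2
    x = F.pad(x, (pad, pad, pad, pad), mode="reflect")
    return F.avg_pool2d(x, kernel_size, stride=1)
```

The shape checks live in the decorator, so the function body only has to handle the one canonical shape. For functions on videos of the form ..., D, H, W, the same decorator would be applied with num_spatial_dims=3.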
@edgarriba OK, then I will start with the submodules in kornia.enhance.adjust first. I will also make a WIP PR so we can discuss there.