Make functions work with arbitrary-dimensional images and videos
🚀 Feature
Make most functionals work with images and videos with any number of dimensions.
Motivation
Most functions require inputs of specific shapes. This is quite restrictive and deep learning-centric. I would like to relax this restriction and make them work with inputs with any number of dimensions. This would be consistent with torchvision, the kornia.color module, and this PR.
Pitch
So far, the following functions can be converted to support this:
kornia.enhance.adjust.solarize
kornia.enhance.adjust.posterize
kornia.enhance.adjust.sharpness
kornia.enhance.adjust.equalize
kornia.enhance.adjust.equalize3d
kornia.enhance.core.add_weighted
kornia.enhance.equalization.equalize_clahe
kornia.enhance.normalize.normalize_min_max
kornia.filters.blur.box_blur
kornia.filters.blur.blur_pool2d
kornia.filters.blur.max_blur_pool2d
kornia.filters.blur.canny
kornia.filters.filter.filter2D
kornia.filters.filter.filter3D
kornia.filters.laplacian.laplacian
kornia.filters.median.median_blur
kornia.filters.motion.motion_blur3d
kornia.filters.motion.sobel.spatial_gradient
kornia.filters.motion.sobel.spatial_gradient3d
kornia.filters.motion.sobel.sobel
kornia.filters.unsharp.unsharp_mask
kornia.geometry.subpix.dsnt.spatial_softmax2d
kornia.geometry.subpix.dsnt.spatial_expectation2d
kornia.geometry.subpix.spatial_soft_argmax.conv_soft_argmax2d
kornia.geometry.subpix.spatial_soft_argmax.conv_soft_argmax3d
kornia.geometry.subpix.spatial_soft_argmax.spatial_soft_argmax2d
kornia.geometry.subpix.spatial_soft_argmax.conv_quad_interp3d
kornia.geometry.transform.crop.crop2d.crop_by_boxes
kornia.geometry.transform.crop.crop2d.crop_by_transform_mat
kornia.geometry.transform.crop.crop3d.crop_by_boxes3d
kornia.geometry.transform.crop.crop3d.crop_by_transform_mat3d
kornia.geometry.transform.affwarp.affine
kornia.geometry.transform.affwarp.affine3d
kornia.geometry.transform.affwarp.rotate
kornia.geometry.transform.affwarp.rotate3d
kornia.geometry.transform.affwarp.translate
kornia.geometry.transform.affwarp.scale
kornia.geometry.transform.affwarp.shear
kornia.geometry.transform.affwarp.rescale
kornia.geometry.transform.elastic_transform.elastic_transform2d
kornia.geometry.transform.homography_warper.homography_warp
kornia.geometry.transform.homography_warper.homography_warp3d
kornia.geometry.transform.imgwarp.warp_perspective
kornia.geometry.transform.imgwarp.warp_affine
kornia.geometry.transform.imgwarp.remap
kornia.geometry.transform.projwarp.warp_affine3d
kornia.geometry.transform.projwarp.warp_perspective3d
kornia.geometry.transform.pyramid.build_pyramid
kornia.geometry.transform.thin_plate_spline.warp_image_tps
kornia.morphology.morphology.dilation
kornia.morphology.morphology.erosion
kornia.morphology.morphology.opening
kornia.morphology.morphology.closing
kornia.morphology.morphology.gradient
kornia.morphology.morphology.top_hat
kornia.morphology.morphology.bottom_hat
Alternatives
Additional context
Simple modules may be changed to work with any number of dimensions by indexing with the ellipsis (...). Others that depend on torch functionality may require reshaping back and forth.
Please feel free to add anything to or remove anything from the list. I can make a PR for this.
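The two approaches mentioned above can be sketched roughly as follows. This is only an illustration, not kornia code: `to_gray` and `box_blur_any` are hypothetical names, the grayscale weights are the common BT.601 ones, and the blur uses `F.avg_pool2d` with zero padding purely as a stand-in for a torch op that insists on an `(N, C, H, W)` input.

```python
import torch
import torch.nn.functional as F


def to_gray(image: torch.Tensor) -> torch.Tensor:
    """Convert (..., 3, H, W) to (..., 1, H, W).

    The ellipsis leaves every leading dimension untouched, so no
    reshaping is needed for this kind of per-pixel operation.
    """
    r = image[..., 0:1, :, :]
    g = image[..., 1:2, :, :]
    b = image[..., 2:3, :, :]
    return 0.299 * r + 0.587 * g + 0.114 * b


def box_blur_any(image: torch.Tensor, kernel_size: int = 3) -> torch.Tensor:
    """Mean-filter the last two dimensions of a (..., H, W) tensor.

    The pooling op is called on an (N, 1, H, W) view here, so the leading
    dimensions are flattened first and restored afterwards.
    Assumes an odd kernel_size so the spatial size is preserved.
    """
    lead = image.shape[:-2]                  # everything before H, W
    h, w = image.shape[-2:]
    flat = image.reshape(-1, 1, h, w)        # (N, 1, H, W)
    out = F.avg_pool2d(flat, kernel_size, stride=1, padding=kernel_size // 2)
    return out.reshape(*lead, *out.shape[-2:])


video = torch.rand(2, 5, 3, 8, 8)            # e.g. (B, T, C, H, W)
print(to_gray(video).shape)                  # leading dims (2, 5) are kept
print(box_blur_any(video).shape)             # same shape as the input
```

The first function needs no shape bookkeeping at all, while the second pays for one `reshape` on the way in and one on the way out.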
Issue Analytics
- State:
- Created 2 years ago
- Comments: 6 (6 by maintainers)
Top GitHub Comments
Hi @edgarriba @lferraz
Yeah, as mentioned in “Additional context”, I think there are two ways: either adding a decorator, like what has been done with kornia.geometry.transform.affwarp.resize, or, in easier cases, using ... to index into tensors.
I think so too. The extra time dimension changes everything. However, I think you might have misunderstood this issue. I don’t propose to make the same functions work for both videos and images. Rather, I want the current functions for videos to work with videos of arbitrary dimensions of the form ..., D, H, W, and those for images to work with inputs of the form ..., H, W. Correct me if I’m wrong.
Regarding the extra overhead cost, IMO it should not be much different, because most functions already contain a bunch of checks to make sure the input is in the correct form. If we add a decorator, it would do nothing when the input is already in the right form, and only reshape when it is not. Also, I think that by adding a decorator we can hide a bunch of checks inside it, making the main code more readable.
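A minimal sketch of what such a decorator could look like is given below. It is an assumption-laden illustration rather than the decorator kornia actually uses: `keep_leading_dims`, its `core_ndim` parameter, and the `downsample2x` example function are all hypothetical names, and the wrapped function is assumed to take a batched tensor such as `(B, C, H, W)` as its first argument and to return a tensor with the same batch size.

```python
from functools import wraps
from typing import Callable

import torch
import torch.nn.functional as F


def keep_leading_dims(core_ndim: int) -> Callable:
    """Hypothetical decorator: accept any number of leading dimensions.

    core_ndim is the rank the wrapped function expects, batch included,
    e.g. 4 for (B, C, H, W) or 5 for (B, C, D, H, W).
    """
    if core_ndim < 2:
        raise ValueError("core_ndim must be at least 2")
    n_keep = core_ndim - 1  # trailing, non-batch dimensions

    def decorator(fn: Callable) -> Callable:
        @wraps(fn)
        def wrapper(input: torch.Tensor, *args, **kwargs) -> torch.Tensor:
            if input.dim() == core_ndim:
                # Already in the expected form: no reshape, no extra cost.
                return fn(input, *args, **kwargs)
            if input.dim() < n_keep:
                raise ValueError(
                    f"expected at least {n_keep} dims, got {input.dim()}"
                )
            lead = input.shape[:-n_keep]
            flat = input.reshape(-1, *input.shape[-n_keep:])
            out = fn(flat, *args, **kwargs)
            # Restore the leading dims; trailing dims may have changed size.
            return out.reshape(*lead, *out.shape[1:])

        return wrapper

    return decorator


@keep_leading_dims(core_ndim=4)
def downsample2x(image: torch.Tensor) -> torch.Tensor:
    # The body can assume a strict (B, C, H, W) input.
    assert image.dim() == 4
    return F.avg_pool2d(image, kernel_size=2)


print(downsample2x(torch.rand(3, 8, 8)).shape)        # unbatched image
print(downsample2x(torch.rand(2, 5, 3, 8, 8)).shape)  # extra leading dim
```

With this design the shape handling lives in one place, and a function body only has to deal with the single input form it was written for.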
@edgarriba OK, then I will start with the submodules in kornia.enhance.adjust first. I will also make a WIP PR so we can discuss there.