morphology.dilation: inconsistency between versions.
🐛 Bug
Hi,
‘kornia.morphology.dilation’ seems to have changed between versions 0.5.3+3a7a77a (git+https://github.com/kornia/kornia@3a7a77a42b1137068bc7945a6745ab59438e94e2) and 0.4.2+ac6a04b (git+https://github.com/kornia/kornia@ac6a04b002103ced2fa2983930b47f87eaded15d).
Over “binary” tensors (float tensors filled with 0/1), version 0.4.2+ac6a04b seems to give the right output (a correctly dilated tensor filled with 0/1). In version 0.5.3+3a7a77a, however, the output is a tensor filled with 1/2. This tensor can be brought to the same output as in 0.4.2+ac6a04b by subtracting 1.
I suspect that the behavior of 0.5.3+3a7a77a comes from applying dilation as defined for non-binary tensors such as grey images, which takes the sup over the sum of the kernel and the image patch, as in the wiki.
While the behavior of version 0.5.3+3a7a77a is fine, the doc should indicate this behavior on binary tensors. Coming from 0.4.2+ac6a04b, I did not expect this behavior in 0.5.3+3a7a77a.
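To make the suspected cause concrete, here is a minimal sketch in plain PyTorch of grayscale dilation computed as the max, over each window, of image value plus structuring-element value. The helper `grayscale_dilation` is hypothetical and written only for illustration; it is not kornia's implementation in either version. Under that definition, an all-ones kernel used as an additive structuring element shifts every output by 1, while `kernel - 1` (all zeros) keeps the output in 0/1.

```python
import torch
import torch.nn.functional as F


def grayscale_dilation(image: torch.Tensor, se: torch.Tensor) -> torch.Tensor:
    """Max over each window of (image value + structuring-element value).

    image: (B, C, H, W); se: (kh, kw) additive structuring element, odd sizes assumed.
    Hypothetical helper for illustration, not kornia's code.
    """
    kh, kw = se.shape
    # Zero-pad so the output keeps the input's spatial size.
    padded = F.pad(image, [kw // 2, kw // 2, kh // 2, kh // 2], mode="constant", value=0.0)
    # windows has shape (B, C, H, W, kh, kw): one patch per output pixel.
    windows = padded.unfold(2, kh, 1).unfold(3, kw, 1)
    # Dilation reflects the structuring element, hence the flip.
    return (windows + se.flip((0, 1))).flatten(-2).max(dim=-1).values


# The binarized tensor from the report, hard-coded to avoid depending on the RNG.
a = torch.tensor([[[[0., 1., 0., 0., 0.],
                    [0., 0., 1., 0., 0.],
                    [0., 0., 0., 0., 0.],
                    [0., 0., 1., 0., 0.],
                    [0., 1., 0., 1., 0.]]]])
kernel = torch.ones(3, 3)

shifted = grayscale_dilation(a, kernel)      # kernel values are added: result in {1, 2}
flat = grayscale_dilation(a, kernel - 1.0)   # flat (all-zero) element: result in {0, 1}
print(shifted)
print(flat)
```

The two results differ by exactly 1 everywhere, which matches the `out - 1` correction described in this report.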
To Reproduce
Steps to reproduce the behavior:
code:
import torch
from kornia.morphology import dilation
seed = 0
torch.manual_seed(seed)
a = torch.rand(1, 1, 5, 5)
print(a)
a = (a > 0.7).float()
kernel = torch.ones(3, 3)
out = dilation(a, kernel)
print(a)
print(out)
Expected behavior (behaviors of the 2 versions)
version 0.4.2+ac6a04b:
In [11]: print(a)
tensor([[[[0.4963, 0.7682, 0.0885, 0.1320, 0.3074],
[0.6341, 0.4901, 0.8964, 0.4556, 0.6323],
[0.3489, 0.4017, 0.0223, 0.1689, 0.2939],
[0.5185, 0.6977, 0.8000, 0.1610, 0.2823],
[0.6816, 0.9152, 0.3971, 0.8742, 0.4194]]]])
In [12]: a = (a > 0.7).float()
In [13]: kernel = torch.ones(3, 3)
In [14]: out = dilation(a, kernel)
In [15]: print(a)
tensor([[[[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 1., 0., 1., 0.]]]])
In [16]: print(out)
tensor([[[[1., 1., 1., 1., 0.],
[1., 1., 1., 1., 0.],
[0., 1., 1., 1., 0.],
[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.]]]])
version 0.5.3+3a7a77a:
print(a) # continuous
tensor([[[[0.4963, 0.7682, 0.0885, 0.1320, 0.3074],
[0.6341, 0.4901, 0.8964, 0.4556, 0.6323],
[0.3489, 0.4017, 0.0223, 0.1689, 0.2939],
[0.5185, 0.6977, 0.8000, 0.1610, 0.2823],
[0.6816, 0.9152, 0.3971, 0.8742, 0.4194]]]])
print(a) # `binarized`
tensor([[[[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 1., 0., 1., 0.]]]])
print(out)
tensor([[[[2., 2., 2., 2., 1.],
[2., 2., 2., 2., 1.],
[1., 2., 2., 2., 1.],
[2., 2., 2., 2., 2.],
[2., 2., 2., 2., 2.]]]])
print(out - 1)
tensor([[[[1., 1., 1., 1., 0.],
[1., 1., 1., 1., 0.],
[0., 1., 1., 1., 0.],
[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.]]]])
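As a version-independent cross-check of the expected 0/1 output, binary dilation with a full 3x3 structuring element can be reproduced with a stride-1 max pooling in plain PyTorch. This is only a sketch under that assumption (binary input, all-ones square kernel); it does not use kornia and does not generalize to arbitrary structuring elements.

```python
import torch
import torch.nn.functional as F

# The binarized tensor from the report, hard-coded to avoid depending on the RNG.
a = torch.tensor([[[[0., 1., 0., 0., 0.],
                    [0., 0., 1., 0., 0.],
                    [0., 0., 0., 0., 0.],
                    [0., 0., 1., 0., 0.],
                    [0., 1., 0., 1., 0.]]]])

# A 3x3 max filter with stride 1 and padding 1 keeps the spatial size and
# marks every pixel that has at least one foreground pixel in its 3x3 neighbourhood.
reference = F.max_pool2d(a, kernel_size=3, stride=1, padding=1)
print(reference)
```

Comparing `reference` against the output of `dilation(a, torch.ones(3, 3))` is a quick way to tell which of the two behaviors an installed version has.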
source code
version 0.4.2+ac6a04b:
# _se_to_mask
def _se_to_mask(se: torch.Tensor) -> torch.Tensor:
    se_h, se_w = se.size()
    se_flat = se.view(-1)
    num_feats = se_h * se_w
    out = torch.zeros(num_feats, 1, se_h, se_w, dtype=se.dtype, device=se.device)
    for i in range(num_feats):
        y = i % se_h
        x = i // se_h
        out[i, 0, x, y] = (se_flat[i] >= 0).float()
    return out


def dilation(tensor: torch.Tensor, kernel: torch.Tensor) -> torch.Tensor:
    r"""Returns the dilated image applying the same kernel in each channel.

    The kernel must have 2 dimensions, each one defined by an odd number.

    Args:
        tensor (torch.Tensor): Image with shape :math:`(B, C, H, W)`.
        kernel (torch.Tensor): Structuring element with shape :math:`(H, W)`.

    Returns:
        torch.Tensor: Dilated image with shape :math:`(B, C, H, W)`.

    Example:
        >>> tensor = torch.rand(1, 3, 5, 5)
        >>> kernel = torch.ones(3, 3)
        >>> dilated_img = dilation(tensor, kernel)
    """
    if not isinstance(tensor, torch.Tensor):
        raise TypeError("Input type is not a torch.Tensor. Got {}".format(
            type(tensor)))
    if len(tensor.shape) != 4:
        raise ValueError("Input size must have 4 dimensions. Got {}".format(
            tensor.dim()))
    if not isinstance(kernel, torch.Tensor):
        raise TypeError("Kernel type is not a torch.Tensor. Got {}".format(
            type(kernel)))
    if len(kernel.shape) != 2:
        raise ValueError("Kernel size must have 2 dimensions. Got {}".format(
            kernel.dim()))
    # prepare kernel
    se_d: torch.Tensor = kernel - 1.
    kernel_d: torch.Tensor = _se_to_mask(se_d)
    # pad
    se_h, se_w = kernel.shape
    pad_d: List[int] = [se_h // 2, se_w // 2]
    output: torch.Tensor = tensor.view(
        tensor.shape[0] * tensor.shape[1], 1, tensor.shape[2], tensor.shape[3])
    output = (F.conv2d(output, kernel_d, padding=pad_d) + se_d.view(1, -1, 1, 1)).max(dim=1)[0]
    return output.view_as(tensor)
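The convolution-based idea in the 0.4.2 listing above can be sketched in isolation: a bank of one-hot kernels copies each neighbour of a pixel into its own channel, the offsets `kernel - 1` are added per channel, and the max over channels gives the dilation. The names `one_hot_kernels` and `conv_dilation` are hypothetical, and this is a simplified illustration rather than a copy of kornia's code (for instance, it does not reflect the structuring element, which only matters for non-symmetric kernels).

```python
import torch
import torch.nn.functional as F


def one_hot_kernels(kh: int, kw: int, like: torch.Tensor) -> torch.Tensor:
    """Return kh*kw kernels of shape (1, kh, kw), each selecting one neighbour."""
    n = kh * kw
    kernels = torch.zeros(n, 1, kh, kw, dtype=like.dtype, device=like.device)
    for i in range(n):
        kernels[i, 0, i // kw, i % kw] = 1.0
    return kernels


def conv_dilation(image: torch.Tensor, kernel: torch.Tensor) -> torch.Tensor:
    """Dilation via one-hot convolutions; `kernel - 1` is the additive offset."""
    b, c, h, w = image.shape
    kh, kw = kernel.shape
    offsets = kernel - 1.0  # all zeros for an all-ones kernel, so no shift
    # neighbours: (B*C, kh*kw, H, W), channel i holds the i-th neighbour of each pixel.
    neighbours = F.conv2d(
        image.reshape(b * c, 1, h, w),
        one_hot_kernels(kh, kw, image),
        padding=(kh // 2, kw // 2),
    )
    out = (neighbours + offsets.reshape(1, -1, 1, 1)).max(dim=1).values
    return out.reshape(b, c, h, w)


a = torch.tensor([[[[0., 1., 0., 0., 0.],
                    [0., 0., 1., 0., 0.],
                    [0., 0., 0., 0., 0.],
                    [0., 0., 1., 0., 0.],
                    [0., 1., 0., 1., 0.]]]])
out = conv_dilation(a, torch.ones(3, 3))
print(out)
```

Because the offsets are `kernel - 1`, an all-ones kernel contributes zero to every window, which is why this convention returns 0/1 on binary input.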
version 0.5.3+3a7a77a:
def dilation(tensor: torch.Tensor, kernel: torch.Tensor, origin: Optional[List[int]] = None) -> torch.Tensor:
    r"""Returns the dilated image applying the same kernel in each channel.

    The kernel must have 2 dimensions.

    Args:
        tensor (torch.Tensor): Image with shape :math:`(B, C, H, W)`.
        kernel (torch.Tensor): Structuring element with shape :math:`(k_x, k_y)`.
        origin (List[int], Tuple[int, int]): Origin of the structuring element. Default is None and uses the center of
            the structuring element as origin (rounding towards zero).

    Returns:
        torch.Tensor: Dilated image with shape :math:`(B, C, H, W)`.

    Example:
        >>> tensor = torch.rand(1, 3, 5, 5)
        >>> kernel = torch.ones(3, 3)
        >>> dilated_img = dilation(tensor, kernel)
    """
    if not isinstance(tensor, torch.Tensor):
        raise TypeError("Input type is not a torch.Tensor. Got {}".format(type(tensor)))
    if len(tensor.shape) != 4:
        raise ValueError("Input size must have 4 dimensions. Got {}".format(tensor.dim()))
    if not isinstance(kernel, torch.Tensor):
        raise TypeError("Kernel type is not a torch.Tensor. Got {}".format(type(kernel)))
    if len(kernel.shape) != 2:
        raise ValueError("Kernel size must have 2 dimensions. Got {}".format(kernel.dim()))
    # origin
    se_h, se_w = kernel.shape
    if origin is None:
        origin = [se_h // 2, se_w // 2]
    # pad
    pad_e: List[int] = [origin[1], se_w - origin[1] - 1, origin[0], se_h - origin[0] - 1]
    output: torch.Tensor = F.pad(tensor, pad_e, mode='constant', value=0.0)
    # computation
    output = output.unfold(2, se_h, 1).unfold(3, se_w, 1)
    output, _ = torch.max(output + kernel.flip((0, 1)), 4)
    output, _ = torch.max(output, 4)
    return output
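The padding arithmetic in the 0.5.3 listing above is easy to check on its own. The sketch below mirrors the `pad_e` expression in a small standalone function; the name `dilation_padding` is hypothetical and exists only for this illustration. The list follows the `F.pad` order (left, right, top, bottom), and the left/right and top/bottom pairs always sum to `se_w - 1` and `se_h - 1`, so the stride-1 `unfold` calls return exactly H x W windows.

```python
from typing import List, Optional, Sequence


def dilation_padding(se_shape: Sequence[int], origin: Optional[Sequence[int]] = None) -> List[int]:
    """Padding [left, right, top, bottom] for a structuring element of shape (se_h, se_w)."""
    se_h, se_w = se_shape
    if origin is None:
        # Default origin: centre of the structuring element, rounded towards zero.
        origin = [se_h // 2, se_w // 2]
    return [origin[1], se_w - origin[1] - 1, origin[0], se_h - origin[0] - 1]


print(dilation_padding((3, 3)))          # [1, 1, 1, 1]
print(dilation_padding((3, 5)))          # [2, 2, 1, 1]
print(dilation_padding((3, 3), [0, 0]))  # [0, 2, 0, 2]
```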
Environment
version 0.4.2+ac6a04b:
$ python collect_env.py
Collecting environment information...
PyTorch version: 1.7.0
Is debug build: True
CUDA used to build PyTorch: 11.0
ROCM used to build PyTorch: N/A
OS: Ubuntu 16.04.4 LTS (x86_64)
GCC version: (Ubuntu 5.4.1-2ubuntu1~16.04) 5.4.1 20160904
Clang version: Could not collect
CMake version: version 3.5.1
Libc version: glibc-2.10
Python version: 3.7 (64-bit runtime)
Python platform: Linux-4.4.0-57-generic-x86_64-with-debian-stretch-sid
Is CUDA available: True
CUDA runtime version: 10.0.130
GPU models and configuration:
GPU 0: Tesla P100-PCIE-16GB
GPU 1: Tesla P100-PCIE-16GB
Nvidia driver version: 455.32.00
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.7.3.0
/usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn.so.5.1.5
/usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn.so.6.0.21
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.19.2
[pip3] torch==1.7.0
[pip3] torchvision==0.8.0
[conda] blas 1.0 mkl
[conda] cudatoolkit 11.0.3 h15472ef_6 conda-forge
[conda] mkl 2020.4 h726a3e6_304 conda-forge
[conda] mkl-service 2.3.0 py37h8f50634_2 conda-forge
[conda] mkl_fft 1.2.0 py37h161383b_1 conda-forge
[conda] mkl_random 1.2.0 py37h9fdb41a_1 conda-forge
[conda] numpy 1.16.2 pypi_0 pypi
[conda] numpy-base 1.19.2 py37hfa32c7d_0
[conda] pytorch 1.7.0 py3.7_cuda11.0.221_cudnn8.0.3_0 pytorch
[conda] torchvision 0.8.0 py37_cu110 pytorch
version 0.5.3+3a7a77a:
$ python collect_env.py
Collecting environment information...
PyTorch version: 1.7.1
Is debug build: False
CUDA used to build PyTorch: 11.0
ROCM used to build PyTorch: N/A
OS: Ubuntu 18.04.5 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.10
Python version: 3.7 (64-bit runtime)
Python platform: Linux-4.15.0-122-generic-x86_64-with-debian-buster-sid
Is CUDA available: True
CUDA runtime version: 10.0.130
GPU models and configuration:
GPU 0: Tesla P100-PCIE-16GB
GPU 1: Tesla P100-PCIE-16GB
Nvidia driver version: 455.32.00
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.4.2
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] efficientnet-pytorch==0.7.0
[pip3] numpy==1.20.1
[pip3] torch==1.8.0
[pip3] torchvision==0.9.0
[conda] blas 1.0 mkl
[conda] cudatoolkit 11.0.221 h6bb024c_0
[conda] efficientnet-pytorch 0.7.0 pypi_0 pypi
[conda] mkl 2020.2 256
[conda] mkl-service 2.3.0 py37he8ac12f_0
[conda] mkl_fft 1.3.0 py37h54f3939_0
[conda] mkl_random 1.1.1 py37h0573a6f_0
[conda] numpy 1.20.1 pypi_0 pypi
[conda] numpy-base 1.19.2 py37hfa32c7d_0
[conda] pytorch 1.7.1 py3.7_cuda11.0.221_cudnn8.0.5_0 pytorch
[conda] torch 1.8.0 pypi_0 pypi
[conda] torchvision 0.9.0 pypi_0 pypi
Sorry that the two envs are different (they are on different servers in different virtual envs), but I think the change in behavior is due to the change in the code of the dilation function.
Thanks
Additional context
Top GitHub Comments
Hi all, thanks. I tested the new version 0.5.5+9a70e35, and it gives the correct output without rectification. Thanks.
Hi there!
@Manza12 , in case of refactoring maybe I can help you with some code that I made when I created the morphology module.
My first approach was .unfold (which, according to the NVIDIA forum, is the best way to implement new stuff).
For example, dilation:
But after doing some benchmarks, the conclusion was that the convolution method uses slightly less GPU memory, so that was the final approach (as discussed in the morphology channel in Slack, which unfortunately no longer exists).
If you want to refactor, I would suggest running some benchmarks to check whether you can outperform the convolution approach.
PS: If you want to check the unfolded erosion, I have it too.