The documentation of torchvision.ops.deform_conv2d is not clear
📚 Documentation
From the documentation, I cannot work out the exact meaning of the 18 (i.e., 2 × 3 × 3) offset channels in a deformable convolution.
I want to visualize the offsets of a deformable convolution with a 3×3 kernel, so it is essential for me to know the exact meaning of these channels.
I have written down one possibility here:
upper-left: ul
upper-right: ur
bottom-left: bl
bottom-right: br
up: u
bottom: b
right: r
left: l
center: c
Possible offset layout (maybe not correct; see also the sketch below):
delta_ul_x, delta_ul_y, delta_u_x, delta_u_y, delta_ur_x, delta_ur_y;
delta_l_x, delta_l_y, delta_c_x, delta_c_y, delta_r_x, delta_r_y;
delta_bl_x, delta_bl_y, delta_b_x, delta_b_y, delta_br_x, delta_br_y;
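For reference, here is a minimal sketch of how an offset tensor of the documented shape [batch_size, 2 * offset_groups * kernel_height * kernel_width, out_height, out_width] can be constructed and passed to torchvision.ops.deform_conv2d. The per-channel ordering (which pair belongs to which kernel position, and whether x or y comes first) is exactly what this issue asks the docs to spell out, so the example sidesteps it by using all-zero offsets, which should reproduce a regular convolution.

```python
import torch
import torch.nn.functional as F
from torchvision.ops import deform_conv2d

# Sizes chosen arbitrarily for illustration.
batch, in_ch, out_ch, k, h, w = 1, 3, 8, 3, 16, 16
x = torch.randn(batch, in_ch, h, w)
weight = torch.randn(out_ch, in_ch, k, k)

# Documented offset shape: [batch, 2 * offset_groups * k * k, out_h, out_w].
# For a 3x3 kernel and offset_groups = 1, these are the 18 channels in question.
offset = torch.zeros(batch, 2 * k * k, h, w)  # padding=1 keeps out_h == h, out_w == w

out = deform_conv2d(x, offset, weight, padding=1)

# With all-zero offsets the sampling grid is the regular one, so the result
# should match an ordinary convolution -- a handy sanity check before trying
# to visualize learned offsets.
ref = F.conv2d(x, weight, padding=1)
print(torch.allclose(out, ref, atol=1e-5))  # expected: True
```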
Maybe some comments should be added to the docs for this. What do you think, @NicolasHug?
I am not a developer, but I think this might be handled by a fixed internal flatten operation that can handle both inputs?
Personally, I think stating in the docs the exact order of the elements encoded in the “2 * offset_groups * kernel_height * kernel_width” dimension would be sufficient; I like the functional approach of the current version.
Assuming a tensor T ordered as offset_groups x kernel_height x kernel_width x [offset_h, offset_w], the docs could then state that the “flattened tensor” to pass to the function will be [T[0,0,0,0], T[0,0,0,1], T[0,0,1,0], T[0,0,1,1], …].
If this assumption is correct, for clarity the docs should state: offset (Tensor[batch_size, offset_groups * kernel_height * kernel_width * 2, out_height, out_width]) – offsets to be applied for each position in the convolution kernel.
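To make that assumption concrete, here is a small sketch (not authoritative; it simply encodes the ordering guessed above) of how the flat channel dimension could be unpacked back into per-kernel-position (offset_h, offset_w) pairs for visualization:

```python
import torch

# Assumed layout (from the comment above): the channel dimension flattens
# offset_groups x kernel_h x kernel_w x [offset_h, offset_w], in that order.
batch, offset_groups, kh, kw, out_h, out_w = 2, 1, 3, 3, 16, 16
offset = torch.randn(batch, 2 * offset_groups * kh * kw, out_h, out_w)

# If the assumption holds, the channels can be viewed as:
offset_view = offset.view(batch, offset_groups, kh, kw, 2, out_h, out_w)
dy = offset_view[:, :, :, :, 0]  # offset along the height axis, per kernel position
dx = offset_view[:, :, :, :, 1]  # offset along the width axis, per kernel position
print(dy.shape)  # torch.Size([2, 1, 3, 3, 16, 16])
```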