The documentation of torchvision.ops.deform_conv2d is not clear
📚 Documentation
From the documentation, I cannot work out the exact meaning of the 18 (i.e., 2 × 3 × 3) offset channels in a deformable convolution.
I want to visualize the offsets of a deformable convolution with a 3×3 kernel, so it is essential for me to know the exact meaning of these channels.
I have written down one possibility here:
upper-left: ul
upper-right: ur
bottom-left: bl
bottom-right: br
up: u
bottom: b
right: r
left: l
center: c
Possible offset layout (maybe not correct; see also the sketch below):
delta_ul_x, delta_ul_y, delta_u_x, delta_u_y, delta_ur_x, delta_ur_y;
delta_l_x, delta_l_y, delta_c_x, delta_c_y, delta_r_x, delta_r_y;
delta_bl_x, delta_bl_y, delta_b_x, delta_b_y, delta_br_x, delta_br_y;
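For reference, here is a minimal sketch of how an offset tensor of the documented shape [batch_size, 2 * offset_groups * kernel_height * kernel_width, out_height, out_width] can be constructed and passed to torchvision.ops.deform_conv2d. The per-channel ordering (which pair belongs to which kernel position, and whether x or y comes first) is exactly what this issue asks the docs to spell out, so the example sidesteps it by using all-zero offsets, which should reproduce a regular convolution.

```python
import torch
import torch.nn.functional as F
from torchvision.ops import deform_conv2d

# Sizes chosen arbitrarily for illustration.
batch, in_ch, out_ch, k, h, w = 1, 3, 8, 3, 16, 16
x = torch.randn(batch, in_ch, h, w)
weight = torch.randn(out_ch, in_ch, k, k)

# Documented offset shape: [batch, 2 * offset_groups * k * k, out_h, out_w].
# For a 3x3 kernel and offset_groups = 1, these are the 18 channels in question.
offset = torch.zeros(batch, 2 * k * k, h, w)  # padding=1 keeps out_h == h, out_w == w

out = deform_conv2d(x, offset, weight, padding=1)

# With all-zero offsets the sampling grid is the regular one, so the result
# should match an ordinary convolution -- a handy sanity check before trying
# to visualize learned offsets.
ref = F.conv2d(x, weight, padding=1)
print(torch.allclose(out, ref, atol=1e-5))  # expected: True
```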
Maybe some comments should be added to the docs for this. What do you think, @NicolasHug?
I am not a developer, but I think this might be handled by a fixed internal flatten operation that can handle both inputs?
Personally, I think stating in the docs the exact order of the elements encoded in the “2 * offset_groups * kernel_height * kernel_width” dimension would be sufficient; I like the functional approach of the current version.
Assuming a tensor T ordered as offset_groups x kernel_height x kernel_width x [offset_h, offset_w], the docs could then state that the “flattened tensor” to pass to the function will be [T[0,0,0,0], T[0,0,0,1], T[0,0,1,0], T[0,0,1,1], …].
If this assumption is correct, for clarity the docs should state: offset (Tensor[batch_size, offset_groups * kernel_height * kernel_width * 2, out_height, out_width]) – offsets to be applied for each position in the convolution kernel.
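To make that assumption concrete, here is a small sketch (not authoritative; it simply encodes the ordering guessed above) of how the flat channel dimension could be unpacked back into per-kernel-position (offset_h, offset_w) pairs for visualization:

```python
import torch

# Assumed layout (from the comment above): the channel dimension flattens
# offset_groups x kernel_h x kernel_w x [offset_h, offset_w], in that order.
batch, offset_groups, kh, kw, out_h, out_w = 2, 1, 3, 3, 16, 16
offset = torch.randn(batch, 2 * offset_groups * kh * kw, out_h, out_w)

# If the assumption holds, the channels can be viewed as:
offset_view = offset.view(batch, offset_groups, kh, kw, 2, out_h, out_w)
dy = offset_view[:, :, :, :, 0]  # offset along the height axis, per kernel position
dx = offset_view[:, :, :, :, 1]  # offset along the width axis, per kernel position
print(dy.shape)  # torch.Size([2, 1, 3, 3, 16, 16])
```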