Add parameter for Deformable Convolution offset group scalar value

🚀 Feature

Currently, the scalar used to calculate the number of deformable groups is hardcoded to 2. I would like a parameter to be added that allows this number to be set to any value, for compatibility with repositories such as EDVR, which use 3.

I have already added it myself and was going to submit a PR before reading that I should submit an issue first.

Motivation

I am currently trying to replace the MMDetection Deformable Convolution v2 with the Torchvision one for the EDVR repository. However, for its offsets, EDVR calculates the out_nc size using this formula: self.deformable_groups * 3 * self.kernel_size[0] * self.kernel_size[1]. The usual formula, which the current Torchvision implementation expects, is self.deformable_groups * 2 * self.kernel_size[0] * self.kernel_size[1]. As you can see, EDVR uses a 3 in this calculation instead of a 2. I’m not entirely sure why, but it doesn’t work unless it uses 3.

This causes an issue when using the Torchvision implementation, as in order to calculate the number of offset groups (called deformable groups in the formula above), it requires that scalar value to be 2.
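To make the mismatch concrete, here is a minimal sketch of the channel arithmetic in plain Python. The helper names (offset_and_mask_channels, infer_offset_groups) are hypothetical, and the inference rule (offset channels divided by 2 * kernel height * kernel width) is an assumption based on the description above, not a copy of Torchvision's code.

```python
def offset_and_mask_channels(offset_groups, kernel_size):
    """Channel counts for one deformable conv layer (hypothetical helper)."""
    kh, kw = kernel_size
    k = kh * kw
    return {
        "offset": 2 * offset_groups * k,    # the "2" formula: (dy, dx) per sample point
        "mask": offset_groups * k,          # one extra scalar per sample point
        "combined": 3 * offset_groups * k,  # the "3" formula used by EDVR
    }


def infer_offset_groups(offset_channels, kernel_size):
    """Recover the group count from an offset tensor's channel size.

    Assumes the rule channels // (2 * kh * kw), as described in the issue.
    """
    kh, kw = kernel_size
    per_group = 2 * kh * kw
    if offset_channels % per_group != 0:
        raise ValueError("offset channels are not a multiple of 2 * kh * kw")
    return offset_channels // per_group


counts = offset_and_mask_channels(offset_groups=8, kernel_size=(3, 3))
# Passing only the offset part recovers the intended 8 groups.
groups_from_offsets = infer_offset_groups(counts["offset"], (3, 3))
# Passing the full "3x" tensor as offsets yields a different group count.
groups_from_combined = infer_offset_groups(counts["combined"], (3, 3))
print(counts, groups_from_offsets, groups_from_combined)
```

With 8 groups and a 3x3 kernel, the "2" formula gives 144 channels and the "3" formula gives 216, so feeding the 216-channel tensor in as offsets would be read as 12 groups rather than 8.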

EDVR Formula

Torchvision Formula

Pitch

I would like for a parameter to be added that would allow me to change this value, like so.

Alternatives

Another alternative could be to allow the number of offset groups to be passed in instead of being auto-calculated, as that is what the MMDetection version does.

Additional context

None.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 8 (4 by maintainers)

Top GitHub Comments

2 reactions
NicolasHug commented, Jun 1, 2021

I agree with @fmassa that the 2 in torchvision’s implementation refers to the h and w dimensions.

From section 3.2 of the DeformConv v2 paper https://arxiv.org/abs/1811.11168:

The output is of 3K channels, where the first 2K channels correspond to the learned offsets ∆pk, and the remaining K channels are further fed to a sigmoid layer to obtain the modulation scalars ∆mk.

where K is self.kernel_size[0] * self.kernel_size[1]. So the difference between 2 and 3 seems to come from the modulation scalars.

I could be wrong as I’m not super familiar with either the paper or the implementation, but I believe those modulation scalars actually correspond to the mask parameter.

I’ll close the issue, please feel free to re-open if there are still some doubts.

1 reaction
fmassa commented, Jan 28, 2021

@JoeyBallentine ok, so from my understanding then this was a user error as the shapes of offsets and masks were not correct, so we couldn’t properly infer the number of offset groups.

BTW, I would not recommend calling torch.ops.torchvision.deform_conv2d directly, as it is an implementation detail and can change at any time without notice. So it might be preferable to fix the code upstream rather than rely on internal implementations.
