AdaptiveAvgPool2D layer
Short Description
At Hugging Face we’ve seen a few PyTorch vision transformer models using `AdaptiveAvgPool2D`. In a lot of cases these just resize to `(1,)` or `(1, 1)`, in which case they’re simply an unusual way to compute `torch.mean()`, but in some cases they actually use the layer’s full functionality.
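To make the degenerate case concrete, here is a minimal NumPy sketch (function name and NCHW layout are my own assumptions) showing that adaptive average pooling to `(1, 1)` is exactly a mean over the spatial axes:

```python
import numpy as np

def adaptive_avg_pool_1x1(x):
    # For output size (1, 1) the single pooling window spans the whole
    # spatial extent, so adaptive average pooling collapses to a mean
    # over the H and W axes (NCHW layout assumed here).
    return x.mean(axis=(2, 3), keepdims=True)

x = np.arange(2 * 3 * 4 * 4, dtype=np.float32).reshape(2, 3, 4, 4)
pooled = adaptive_avg_pool_1x1(x)
print(pooled.shape)  # (2, 3, 1, 1)
```

This is the same result as `torch.mean(x, dim=(2, 3), keepdim=True)` for the equivalent tensor.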
The problem with `AdaptiveAvgPool2D` is that it computes its pooling windows in a unique way: window boundaries depend on the output index, so window sizes can vary (and windows can even overlap). This makes it impossible to reproduce with a standard fixed-window pooling layer, and very annoying to port to TF, especially if you want to load weights from a model trained with the Torch layer. There is an implementation in `tensorflow-addons`, but it uses fixed-size windows and so does not match the output of the Torch layer unless the input size is an integer multiple of the output size.
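To illustrate the window rule (as described in the StackOverflow post linked below): window `i` of a 1D adaptive pool covers `[floor(i*in/out), ceil((i+1)*in/out))`. A small sketch, using my own helper name:

```python
import math

def adaptive_windows(in_size, out_size):
    # PyTorch-style adaptive pooling window for output index i:
    # start = floor(i * in / out), end = ceil((i + 1) * in / out).
    # Window lengths can differ, and windows can overlap.
    return [
        (math.floor(i * in_size / out_size),
         math.ceil((i + 1) * in_size / out_size))
        for i in range(out_size)
    ]

print(adaptive_windows(5, 3))   # [(0, 2), (1, 4), (3, 5)] -> lengths 2, 3, 2
print(adaptive_windows(10, 4))  # [(0, 3), (2, 5), (5, 8), (7, 10)] -> overlapping
```

A fixed-window pooling layer cannot reproduce either pattern, which is why the `tensorflow-addons` version only agrees when the input size is a multiple of the output size.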
We made a reasonably performant TF version that correctly matches the Torch layer in all cases. Do you need this in `keras-cv`? We’d be happy to add it as a PR if it’s useful.
Existing Implementations
- Torch layer
Other information
See this StackOverflow post for a good description of how the Torch layer works internally.
Issue Analytics
- State:
- Created: a year ago
- Reactions: 2
- Comments: 15 (1 by maintainers)
Top GitHub Comments
Hi @innat I could, yes! If you look at the gist I linked above, the `pseudo_1d_pool` function is basically just a 1D `AdaptivePool`, so that would be very easy to implement as a separate layer. To do a 3D pool I would just do a 1D pool on each of the 3 dimensions.

@Rocketknight1 Do you also implement 1D and 3D versions of this layer?