Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

`kxd` matrix or `1xd` vector?

See original GitHub issue

In section 3 of paper ‘Augmenting Convolutional networks with attention-based aggregation’: ··· We can easily specialize the attention maps per class by replacing the CLS vector with a k × d matrix, where each of the k columns is associated with one of the classes. This specialization allows us to visualize an attention map for each class, as shown in Figure 2. ··· But I only found 1 x d vector. Where is k x d matrix?

https://github.com/facebookresearch/deit/blob/40ae72b79cc5cd48dac2b02e1fceb03ee4192676/patchconvnet_models.py#L201

Issue Analytics

State:
Created 2 years ago
Comments:6 (4 by maintainers)

Top GitHub Comments

1reaction

jegoucommented, Jan 1, 2022

This design is introduced as a straightforward variation of our learned aggregation layer. And yes: mostly for vizualization.

In Figure 1 most of our results are with 1 token. I am sorry if you think you find we have spend ‘much time’, I actually thought that we were presenting it as a straightforward variation of our main proposal: only the sentence ‘We can specialize this attention per class’ in Figure 1 for the last column, and a small paragraph in Section 3, where we point out some limitations of this variant. Maybe the confusing part is Fig. 3, where we could easily explain the tiny difference between both choices.

So yes, our paper has a significant focus on how to provide vizualisation and some interpretable aggregation layer to convnets, this is also true for the single-token version.

The choice of the single token is driven by a more direct relationship to this objective, as it provides the weight of each patch in the aggregation. The one-class-token provides a weight per patch for each class, independent of whether the class is selected or not by the classifier. While both have their pros and cons, we feel like they are complementary for vizualisation purposes.

Hervé

0reactions

densechencommented, Jan 8, 2022

@TouvronHugo Thanks for your kindly reply. Your detailed explanation makes me know better about the whole process. Thanks!

Top Results From Across the Web

k-means-mog

K-means is inappropriate when either data given as a proximity matrix or an average over data-points xn is not meaningful way to calculate...

OpenCV: append different vectors as one row edit

I have a cv::Mat1f vector which size is kxd . How can I fill it by appending k different 1xd vectors? I want...

Part 1: Vectors and matrix basics

In this series of posts we describe some basic ideas regarding vectors and matrices (also called 'arrays') that are fundamental to understanding machine ......

Convert 1D array in to row or column vector in Numpy

This formula remains valid for vectors if we assume that the row vector is a matrix of dimension (1, N), and the column...

12.1.1 Matrices and Vectors Definition of Matrix. An MxN ...

An MxN matrix A is a two-dimensional array of numbers ... A column vector is a Bx1 matrix and a row vector is...