error in self attention visualisation
I believe there is an error in the indexing of the self-attention:
for idx_o, ax in zip(idxs, axs):
    idx = (idx_o[0] // fact, idx_o[1] // fact)
    ax.imshow(sattn[..., idx[0], idx[1]], cmap='cividis', interpolation='nearest')
    ax.axis('off')
    ax.set_title(f'self-attention{idx_o}')
should be
sattn[idx[0], idx[1], ...]
since the last two dimensions sum to 1, the first two dimensions are the query locations and are the ones to index into. With this indexing the attention map is the one for the selected query location.
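A minimal sketch of why the first two dimensions are the ones to index, assuming (as in the notebook) that sattn is the encoder self-attention reshaped to shape (H, W, H, W) with the softmax taken over the keys, i.e. the last two dimensions; the shapes and names here are illustrative:

import torch

# Toy tensor with the assumed layout: softmax over the key locations (last two dims).
H, W = 4, 5
logits = torch.randn(H, W, H, W)
sattn = torch.softmax(logits.flatten(2), dim=-1).reshape(H, W, H, W)

# Each query location (i, j) holds a full distribution over all key locations:
print(sattn[0, 0, ...].sum())   # ~1.0 -> sattn[idx[0], idx[1], ...] is a valid attention map
print(sattn[..., 0, 0].sum())   # generally != 1.0 -> a column of the matrix, not a single map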
Top GitHub Comments
@songhwanjun good catch. I had fixed this for the class AttentionVisualizer, but it looks like I missed it in the example cell just above. I'll send a PR fixing it.

Hi,
This is a fair point and is discussed at the bottom of the notebook as well.
In the end, it's a matter of deciding whether you want the attention of all pixels at location x, or the contribution of location x to all the attention maps.
I would say both pieces of information are useful to consider, and as such I added both to the widget visualization.
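For concreteness, a rough sketch of the two complementary views, again assuming sattn has shape (H, W, H, W) with the softmax over the last two (key) dimensions; the helper name and figure layout are illustrative, not the notebook's code:

import matplotlib.pyplot as plt

def show_both_views(sattn, i, j):
    """Plot both slices of a (H, W, H, W) self-attention tensor for location (i, j)."""
    fig, (ax_q, ax_k) = plt.subplots(1, 2, figsize=(8, 4))

    # View 1: what the query at (i, j) attends to (a row of the attention matrix, sums to 1).
    ax_q.imshow(sattn[i, j, ...], cmap='cividis', interpolation='nearest')
    ax_q.set_title(f'attention of query ({i}, {j})')

    # View 2: how strongly every query attends to the key at (i, j) (a column of the matrix).
    ax_k.imshow(sattn[..., i, j], cmap='cividis', interpolation='nearest')
    ax_k.set_title(f'contribution of ({i}, {j}) to all maps')

    for ax in (ax_q, ax_k):
        ax.axis('off')
    plt.show()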