error in self attention visualisation
I believe there is an error in the indexing of the self-attention:
for idx_o, ax in zip(idxs, axs):
    idx = (idx_o[0] // fact, idx_o[1] // fact)
    ax.imshow(sattn[..., idx[0], idx[1]], cmap='cividis', interpolation='nearest')
    ax.axis('off')
    ax.set_title(f'self-attention{idx_o}')
should be
sattn[idx[0], idx[1], ...]
since the last two dimensions sum to 1, the first two dimensions are the query locations and are the ones to index into. With this indexing the attention map is the one for the selected query location.
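A minimal sketch of why the first two dimensions are the ones to index, assuming (as in the notebook) that sattn is the encoder self-attention reshaped to shape (H, W, H, W) with the softmax taken over the keys, i.e. the last two dimensions; the shapes and names here are illustrative:

import torch

# Toy tensor with the assumed layout: softmax over the key locations (last two dims).
H, W = 4, 5
logits = torch.randn(H, W, H, W)
sattn = torch.softmax(logits.flatten(2), dim=-1).reshape(H, W, H, W)

# Each query location (i, j) holds a full distribution over all key locations:
print(sattn[0, 0, ...].sum())   # ~1.0 -> sattn[idx[0], idx[1], ...] is a valid attention map
print(sattn[..., 0, 0].sum())   # generally != 1.0 -> a column of the matrix, not a single map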
Top GitHub Comments
@songhwanjun good catch. I had fixed this for the class AttentionVisualizer, but it looks like I missed it in the example cell just above. I'll send a PR fixing it.

Hi,
This is a fair point and is discussed at the bottom of the notebook as well.
In the end, it's a matter of deciding whether you want the attention of all pixels at location x, or the contribution of location x to all the attention maps.
I would say both pieces of information are useful to consider, and as such I added both to the widget visualization.
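For concreteness, a rough sketch of the two complementary views, again assuming sattn has shape (H, W, H, W) with the softmax over the last two (key) dimensions; the helper name and figure layout are illustrative, not the notebook's code:

import matplotlib.pyplot as plt

def show_both_views(sattn, i, j):
    """Plot both slices of a (H, W, H, W) self-attention tensor for location (i, j)."""
    fig, (ax_q, ax_k) = plt.subplots(1, 2, figsize=(8, 4))

    # View 1: what the query at (i, j) attends to (a row of the attention matrix, sums to 1).
    ax_q.imshow(sattn[i, j, ...], cmap='cividis', interpolation='nearest')
    ax_q.set_title(f'attention of query ({i}, {j})')

    # View 2: how strongly every query attends to the key at (i, j) (a column of the matrix).
    ax_k.imshow(sattn[..., i, j], cmap='cividis', interpolation='nearest')
    ax_k.set_title(f'contribution of ({i}, {j}) to all maps')

    for ax in (ax_q, ax_k):
        ax.axis('off')
    plt.show()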