Replacing Attention module of Vision Transformer with SelfAttention Module of Performer?
Hey, thanks for your great work, I love it! 😃 A quick question: in your repo for the Vision Transformer (https://github.com/lucidrains/vit-pytorch) there is a module called Attention. Can I simply use the Vision Transformer and replace the Attention module with the SelfAttention module from the Performer?
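A minimal sketch of what such a swap could look like, assuming performer-pytorch's SelfAttention (which implements FAVOR+) as the replacement and a vit-pytorch version whose Transformer stores its layers as (attention, feed-forward) pairs in transformer.layers; the attribute path and hyperparameters below are illustrative, not taken from the issue:

```python
import torch
from torch import nn
from vit_pytorch import ViT                      # https://github.com/lucidrains/vit-pytorch
from performer_pytorch import SelfAttention      # FAVOR+ attention from performer-pytorch

# Example hyperparameters only; pick whatever matches your setup.
vit = ViT(
    image_size = 224,
    patch_size = 16,
    num_classes = 1000,
    dim = 768,
    depth = 12,
    heads = 12,
    mlp_dim = 3072,
)

# Swap every attention layer for Performer self-attention.
# NOTE: the internal layout (vit.transformer.layers holding
# (attention, feed-forward) pairs) varies across vit-pytorch versions;
# print(vit) first and adapt the attribute path if yours differs.
for layer in vit.transformer.layers:
    layer[0] = nn.Sequential(
        nn.LayerNorm(768),                       # keep the pre-norm the original Attention applies
        SelfAttention(dim = 768, heads = 12, dim_head = 64, causal = False),
    )

img = torch.randn(1, 3, 224, 224)
preds = vit(img)   # (1, 1000), computed with freshly initialised attention weights
```

Newer vit-pytorch versions fold the pre-attention LayerNorm into the Attention module itself, which is why this sketch wraps the Performer layer in its own LayerNorm; whichever version you use, the swapped-in layers start from random weights, so the model has to be (re)trained.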
Issue Analytics
- State:
- Created: 3 years ago
- Comments: 6 (3 by maintainers)
Top Results From Across the Web
ViT-LSLA: Vision Transformer with Light Self-Limited-Attention
Firstly, the LSA replaces the K (Key) and V (Value) of self-attention with the X (origin input). Applying it in vision Transformers which ...

Microsoft AI Proposes 'FocalNets' Where Self-Attention is ...
Microsoft AI Proposes 'FocalNets' Where Self-Attention is Completely Replaced by a Focal Modulation Module, Enabling To Build New Computer ...

Vision Transformer With Deformable Attention
propose a novel deformable self-attention module, where ... Comparison of DAT with other Vision Transformer mod- ... performance improvements.

Pay Less Attention in Vision Transformers
end, we present a novel Less attention vIsion Transformer (LIT), building upon the fact that ... complexity of the self-attention module. Targeting at ...

Attention Is All You Need - NIPS papers
The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, ...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
@lucidrains I recently used your implementations of performer (https://github.com/microsoft/vision-longformer/blob/main/src/models/layers/performer.py) and linformer (https://github.com/microsoft/vision-longformer/blob/main/src/models/layers/linformer.py) to compare different efficient attention mechanisms in image classification and object detection tasks. See the results reported here: https://github.com/microsoft/vision-longformer. Thank you for your excellent open-sourced code!
@PascalHbr @NZ42 You may be interested in the results, too.
Thank you for the quick reply. In all honesty, I’m interested in substituting the self-attention of vision transformers with FAVOR. I see that in your other repo you use the Linformer. Do you have any tips on how best to approach this? I’m also looking into substituting it in pretrained models from timm.
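For the timm route, a rough sketch of what patching a pretrained ViT might look like, assuming timm's VisionTransformer layout (model.blocks, block.attn, model.embed_dim) and performer-pytorch's SelfAttention; the replaced attention layers are randomly initialised, so the model needs fine-tuning and the pretrained weights may transfer poorly:

```python
import timm
import torch
from performer_pytorch import SelfAttention     # FAVOR+ attention

model = timm.create_model('vit_base_patch16_224', pretrained = True)

embed_dim = model.embed_dim                     # 768 for vit_base
num_heads = model.blocks[0].attn.num_heads      # read before the modules are replaced

# timm blocks compute x = x + drop_path(attn(norm1(x))), so any module that
# maps (batch, tokens, dim) -> (batch, tokens, dim) can stand in for block.attn.
for block in model.blocks:
    block.attn = SelfAttention(
        dim = embed_dim,
        heads = num_heads,
        dim_head = embed_dim // num_heads,
        causal = False,
    )

img = torch.randn(1, 3, 224, 224)
out = model(img)   # (1, 1000); fine-tune before expecting sensible predictions
```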