The sequence of operations in the Linformer Attention module is probably wrong.
🐛 Bug
Hi guys! In Linformer’s example source code, I found that the operation order may not match the mathematics in the official paper.
Here, in the code, the linear attention is done in the following sequence of two operations:
- a Linear projection from the n token representations to k,
- a Linear projection over the embedding dimension (d_m to d_k),

as here (#208 and #213 respectively); see the sketch right after this list.
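For illustration, here is a minimal PyTorch sketch of that order (this is not the fairseq code; the tensor names and sizes are assumptions made up for the example): the sequence length is compressed from n to k first, and only then is the embedding dimension projected from d_m to d_k.

```python
import torch
import torch.nn as nn

# Assumed sizes, for illustration only (not taken from the fairseq config).
n, k, d_m, d_k = 512, 128, 768, 64
x = torch.randn(n, d_m)                      # token representations feeding keys/values

compress = nn.Linear(n, k, bias=False)       # E: n -> k, applied along the sequence axis
proj_k   = nn.Linear(d_m, d_k, bias=False)   # W^K: d_m -> d_k

# 1) Linear projection from the n token representations to k
x_compressed = compress(x.transpose(0, 1)).transpose(0, 1)   # (k, d_m)
# 2) Linear projection over the embedding dimension
keys = proj_k(x_compressed)                                   # (k, d_k)
print(keys.shape)  # torch.Size([128, 64])
```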
On the contrary, this image from the Linformer paper states that it should be performed in the order of:
- a Linear projection over the embedding dimension (d_m to d_k),
- a Linear projection from the n token representations to k,

as seen in the figure from the paper.
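And a minimal sketch of the order as written in the paper, where the per-head projection is applied before the sequence compression (again, the names and sizes are illustrative assumptions, not the fairseq implementation):

```python
import torch
import torch.nn as nn

# Assumed sizes, for illustration only.
n, k, d_m, d_k = 512, 128, 768, 64
x = torch.randn(n, d_m)

proj_k   = nn.Linear(d_m, d_k, bias=False)   # W^K: d_m -> d_k
compress = nn.Linear(n, k, bias=False)       # E: n -> k, applied along the sequence axis

# 1) Linear projection over the embedding dimension
x_proj = proj_k(x)                                            # (n, d_k)
# 2) Linear projection from the n token representations to k
keys = compress(x_proj.transpose(0, 1)).transpose(0, 1)      # (k, d_k)
print(keys.shape)  # torch.Size([128, 64])
```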
Am I missing something important here? If anything gets confirmed, I am up for fixing it.
Environment
Current fairseq version.
Top GitHub Comments
Any update on this?
@madian9