About the Duplex attention
Hi, thanks for sharing the code!
I have a few questions about Section 3.1.2. Duplex attention.
- I am confused by the notation in this section. For example, “Y = (K^{P×d}, V^{P×d}), where the values store the content of the Y variables (e.g. the randomly sampled latents for the case of GAN)”. Does this mean that V^{P×d} is sampled from the original variable Y? And how do you set the number P in your code?
- “Keys track the centroids of the attention-based assignments from X to Y, which can be computed as K = a_b(Y, X)”. Does this mean K is calculated with the self-attention module but with (Y, X) as input? If so, how should I understand “the keys track the centroids of the attention-based assignments from X to Y”? And how are the centroids obtained?
- In the update rule of duplex attention, what does the a() function mean? Does it denote a self-attention module like a_b() in Section 3.1.1, with X as queries, K as keys, and V as values? If so, since K is itself computed by another self-attention module (as in question 2), the output of a_b(Y, X) is treated as the keys, so the update rule contains two self-attention operations. Is that right? Is that why it is called “duplex” attention?
- However, I suspect I may be wrong, given the last paragraph of this section: “to support bidirectional interaction between elements, we can chain two reciprocal simplex attentions from X to Y and from Y to X, obtaining the duplex attention”. Does this mean we first update Y with a simplex attention module u^a(Y, X), and then use this Y as input to u^d(X, Y) to update X? Would the duplex attention module then contain three self-attention operations in total?
Thanks a lot! 😃
Issue Analytics
- State:
- Created: 2 years ago
- Reactions: 4
- Comments: 7 (3 by maintainers)
Thanks a lot for your detailed reply! Now I understand the core idea of the duplex attention part.
Thank you! 😃
Hi all!

@07hyx06 Yep, that's correct! We first find the centroids by casting attention over the image features (X), and then update the features based on the centroids (K).

@nicolas-dufour That's right: the values are not iteratively updated, only the centroids and the image features!

@subminu Thanks so much for pointing that out! I'll update the paper with that fix!
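To make sure I follow: across iterations only K and X are refined while V stays fixed, as in this toy NumPy sketch (my own simplification, not the repository's implementation, with no learned projections):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def attn(q, k, v):
    # Plain scaled dot-product attention.
    return softmax(q @ k.T / np.sqrt(q.shape[-1])) @ v

rng = np.random.default_rng(1)
n, P, d = 5, 3, 4
x = rng.normal(size=(n, d))   # image features X
k = rng.normal(size=(P, d))   # keys / centroids of Y
v = rng.normal(size=(P, d))   # values of Y (stored latents)
v0 = v.copy()

# Across layers, only the centroids K and the features X are refined;
# the values V are never reassigned.
for _ in range(3):
    k = attn(k, x, x)   # centroids follow the X -> Y assignments
    x = attn(x, k, v)   # features updated from the fixed values

assert np.allclose(v, v0)  # V was never modified
```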