
Confused about the attn_weight in the _call_dense method of the GATConv layer

See original GitHub issue

I notice that `mask = -10e9 * (1.0 - a)` is used to suppress entries with small coefficients, but since you set `a = tf.linalg.set_diag(a, tf.ones(shape, a.dtype))`, the diagonal of the coefficient matrix is set to 1, which means the mask on the diagonal (self-attention) is 0. Now consider a pair whose coefficient is 0.9, which is high enough: its mask will be -10e8 (about -1e9). When the softmax is applied later, that 0.9 coefficient is effectively pushed to 0.
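To make the numbers concrete, a minimal sketch with toy values (a single adjacency row with a self-loop forced to 1, a 0.9-weight edge, and a non-edge; the equal logits are just an assumption for illustration):

    import tensorflow as tf

    # Toy adjacency row: self-loop (set to 1 by set_diag), a 0.9-weight edge, a non-edge.
    a_row = tf.constant([1.0, 0.9, 0.0])
    logits = tf.constant([2.0, 2.0, 2.0])   # pretend the attention logits are all equal

    mask = -10e9 * (1.0 - a_row)            # -> [0.0, -1e9, -1e10]
    attn = tf.nn.softmax(logits + mask)
    print(attn.numpy())                     # ~[1.0, 0.0, 0.0]: the 0.9-weight edge is wiped out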

    def _call_dense(self, x, a):
        shape = tf.shape(a)[:-1]
        a = tf.linalg.set_diag(a, tf.zeros(shape, a.dtype))
        a = tf.linalg.set_diag(a, tf.ones(shape, a.dtype))
        x = tf.einsum("...NI , IHO -> ...NHO", x, self.kernel)
        attn_for_self = tf.einsum("...NHI , IHO -> ...NHO", x, self.attn_kernel_self)
        attn_for_neighs = tf.einsum(
            "...NHI , IHO -> ...NHO", x, self.attn_kernel_neighs
        )
        attn_for_neighs = tf.einsum("...ABC -> ...CBA", attn_for_neighs)

        attn_coef = attn_for_self + attn_for_neighs
        attn_coef = tf.nn.leaky_relu(attn_coef, alpha=0.2)

        mask = -10e9 * (1.0 - a)
        attn_coef += mask[..., None, :]
        attn_coef = tf.nn.softmax(attn_coef, axis=-1)
        attn_coef_drop = self.dropout(attn_coef)

        output = tf.einsum("...NHM , ...MHI -> ...NHI", attn_coef_drop, x)

        return output, attn_coef

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 8 (4 by maintainers)

Top GitHub Comments

2 reactions
LJingnan commented, Mar 25, 2021

I get it. Thank you very very much.

0 reactions
danielegrattarola commented, Mar 25, 2021

If you remove the mask altogether, then all the zeros in the adjacency matrix will be considered as edges.

If you are willing to write your own layer, you can try binarizing the adjacency matrix only for computing the mask, something like this:

    mask = -10e9 * (1.0 - tf.cast(a > 0, tf.float32))
    attn_coef += mask[..., None, :]
    attn_coef = tf.nn.softmax(attn_coef, axis=-1)
    attn_coef *= a

but that is a bit more expensive.
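As a self-contained illustration of that suggestion (toy adjacency values; the `[..., None, :]` indexing on the reweighting step is added here only so the shapes broadcast against per-head logits, and is an assumption rather than part of the original comment):

    import tensorflow as tf

    def binarized_mask_attention(attn_coef, a):
        # Binarize `a` only for the mask, then reweight by the original edge weights.
        mask = -10e9 * (1.0 - tf.cast(a > 0, tf.float32))
        attn_coef += mask[..., None, :]
        attn_coef = tf.nn.softmax(attn_coef, axis=-1)
        attn_coef *= a[..., None, :]
        return attn_coef

    # Toy graph: 3 nodes, 1 attention head, weighted adjacency with self-loops.
    a = tf.constant([
        [1.0, 0.9, 0.0],
        [0.9, 1.0, 0.0],
        [0.0, 0.0, 1.0],
    ])
    logits = tf.zeros([3, 1, 3])            # [nodes, heads, nodes]
    print(binarized_mask_attention(logits, a).numpy()[0, 0])
    # -> roughly [0.5, 0.45, 0.0]: the 0.9-weight edge keeps a nonzero coefficient

With the original mask, the same setup drives the 0.9-weight entry to zero, which is the behaviour raised in the question.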

Read more comments on GitHub >

Top Results From Across the Web

GATConv — DGL 0.9.1post1 documentation
GATConv can be applied on homogeneous graph and unidirectional bipartite graph. If the layer is to be applied to a unidirectional bipartite graph, ...
Read more >
Questions on the GAT conv layer · Issue #1851 - GitHub
Questions & Help I have a few questions from a newbie in PyTorch Geometric regarding the GAT model: 1/ In the forward method, ...
Read more >
torch_geometric.nn — pytorch_geometric documentation
paper, which fixes the static attention problem of the standard GATConv layer. TransformerConv. The graph transformer operator from the "Masked Label ...
Read more >
Understanding Graph Attention Networks (GAT)
GAT (Graph Attention Network), is a novel neural network architecture that operate on graph-structured data, leveraging masked self-attentional ...
Read more >
