Confused about the attention weights in the `_call_dense` method of the GATConv layer
I notice that `mask = -10e9 * (1.0 - a)` is used to suppress entries with low coefficients, but since `a = tf.linalg.set_diag(a, tf.ones(shape, a.dtype))` sets the diagonal of the coefficient matrix to 1, the mask on the diagonal (self-attention) is 0. Now think about a pair whose coefficient is 0.9, which is high enough: its mask will be -1e9. When the softmax is applied later, that 0.9 coefficient will be driven to 0.
```python
def _call_dense(self, x, a):
    shape = tf.shape(a)[:-1]
    # Zero the diagonal, then set it to ones (add self-loops)
    a = tf.linalg.set_diag(a, tf.zeros(shape, a.dtype))
    a = tf.linalg.set_diag(a, tf.ones(shape, a.dtype))
    # Per-head linear transformation of the node features
    x = tf.einsum("...NI , IHO -> ...NHO", x, self.kernel)
    attn_for_self = tf.einsum("...NHI , IHO -> ...NHO", x, self.attn_kernel_self)
    attn_for_neighs = tf.einsum(
        "...NHI , IHO -> ...NHO", x, self.attn_kernel_neighs
    )
    attn_for_neighs = tf.einsum("...ABC -> ...CBA", attn_for_neighs)

    attn_coef = attn_for_self + attn_for_neighs
    attn_coef = tf.nn.leaky_relu(attn_coef, alpha=0.2)

    # Mask non-edges with a large negative value before the softmax
    mask = -10e9 * (1.0 - a)
    attn_coef += mask[..., None, :]
    attn_coef = tf.nn.softmax(attn_coef, axis=-1)
    attn_coef_drop = self.dropout(attn_coef)

    output = tf.einsum("...NHM , ...MHI -> ...NHI", attn_coef_drop, x)
    return output, attn_coef
```
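To see the problem concretely, here is a quick standalone check with toy values (not taken from the issue): with equal raw attention scores, a 0.9-weighted edge is masked almost as hard as a non-edge.

```python
import tensorflow as tf

# One row of a weighted adjacency matrix (hypothetical values):
# self-loop = 1.0, a real edge with weight 0.9, and a non-edge.
a_row = tf.constant([1.0, 0.9, 0.0])
logits = tf.constant([1.0, 1.0, 1.0])  # pretend all raw attention scores are equal

mask = -10e9 * (1.0 - a_row)  # [0.0, -1e9, -1e10]
print(tf.nn.softmax(logits + mask).numpy())
# -> ~[1.0, 0.0, 0.0]: after the softmax, the 0.9-weighted edge receives
#    zero attention, exactly like the non-edge.
```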
Issue Analytics
- Created 2 years ago
- Comments: 8 (4 by maintainers)
Top GitHub Comments
I get it. Thank you very very much.
If you remove the mask altogether, then all the zeros in the adjacency matrix will be considered edges.
If you are willing to code your own layer, you can try to binarize the adjacency matrix only for computing the mask, so something like this:
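(A sketch of the binarized-mask idea, reusing the variable names from `_call_dense` above; only the masking lines change.)

```python
# Binarize `a` only for the mask; `a` itself keeps its edge weights.
a_bin = tf.cast(tf.not_equal(a, 0.0), a.dtype)  # 1 where an edge exists, else 0
mask = -10e9 * (1.0 - a_bin)
attn_coef += mask[..., None, :]
```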
but that is a bit more expensive.