Confused about the attention weights in the `_call_dense` method of the GATConv layer
I notice that `mask = -10e9 * (1.0 - a)` is used to suppress entries with low coefficients, but since `a = tf.linalg.set_diag(a, tf.ones(shape, a.dtype))` sets the diagonal of the coefficient matrix to 1, the mask on the diagonal (self-attention) is 0. Now think about a pair whose coefficient is 0.9, which is high enough: its mask will be -1e9. When the softmax is applied later, that 0.9 coefficient will be driven to 0.
```python
def _call_dense(self, x, a):
    shape = tf.shape(a)[:-1]
    # Zero the diagonal, then set it to ones (add self-loops)
    a = tf.linalg.set_diag(a, tf.zeros(shape, a.dtype))
    a = tf.linalg.set_diag(a, tf.ones(shape, a.dtype))
    # Per-head linear transformation of the node features
    x = tf.einsum("...NI , IHO -> ...NHO", x, self.kernel)
    attn_for_self = tf.einsum("...NHI , IHO -> ...NHO", x, self.attn_kernel_self)
    attn_for_neighs = tf.einsum(
        "...NHI , IHO -> ...NHO", x, self.attn_kernel_neighs
    )
    attn_for_neighs = tf.einsum("...ABC -> ...CBA", attn_for_neighs)

    attn_coef = attn_for_self + attn_for_neighs
    attn_coef = tf.nn.leaky_relu(attn_coef, alpha=0.2)

    # Mask non-edges with a large negative value before the softmax
    mask = -10e9 * (1.0 - a)
    attn_coef += mask[..., None, :]
    attn_coef = tf.nn.softmax(attn_coef, axis=-1)
    attn_coef_drop = self.dropout(attn_coef)

    output = tf.einsum("...NHM , ...MHI -> ...NHI", attn_coef_drop, x)
    return output, attn_coef
```
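To see the problem concretely, here is a quick standalone check with toy values (not taken from the issue): with equal raw attention scores, a 0.9-weighted edge is masked almost as hard as a non-edge.

```python
import tensorflow as tf

# One row of a weighted adjacency matrix (hypothetical values):
# self-loop = 1.0, a real edge with weight 0.9, and a non-edge.
a_row = tf.constant([1.0, 0.9, 0.0])
logits = tf.constant([1.0, 1.0, 1.0])  # pretend all raw attention scores are equal

mask = -10e9 * (1.0 - a_row)  # [0.0, -1e9, -1e10]
print(tf.nn.softmax(logits + mask).numpy())
# -> ~[1.0, 0.0, 0.0]: after the softmax, the 0.9-weighted edge receives
#    zero attention, exactly like the non-edge.
```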
Issue Analytics
- Created 2 years ago
- Comments: 8 (4 by maintainers)
Top GitHub Comments
I get it. Thank you very very much.
If you remove the mask altogether, then all the zeros in the adjacency matrix will be considered edges.
If you are willing to code your own layer, you can try to binarize the adjacency matrix only for computing the mask, so something like this:
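(A sketch of the binarized-mask idea, reusing the variable names from `_call_dense` above; only the masking lines change.)

```python
# Binarize `a` only for the mask; `a` itself keeps its edge weights.
a_bin = tf.cast(tf.not_equal(a, 0.0), a.dtype)  # 1 where an edge exists, else 0
mask = -10e9 * (1.0 - a_bin)
attn_coef += mask[..., None, :]
```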
but that is a bit more expensive.