
Implementation of knowledge graph attention


From now on, we recommend using our discussion forum (https://github.com/rusty1s/pytorch_geometric/discussions) for general questions.

❓ Questions & Help

Hi, I’d like to implement the function shown below. [Two figures with the attention equations were attached here.] That is, I need to compute an attention score alpha between the head entity and its relations, and an attention score beta between a relation and its tail entities. So, how can I use PyG to compute an attention score over only a part of the neighbors, not all neighbors?

Note: I can implement the numerator, but it’s not clear how to compute the denominator. [A figure explaining the tensor dimensions inside the softmax was attached here; num_edge is the total number of columns of edge_index.]

Looking forward to your help! Thanks a lot!
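Since the figures did not survive, here is a hedged reconstruction of the two scores, inferred from the code in the accepted answer below (concatenated embeddings multiplied by a weight vector, a LeakyReLU, and a grouped softmax). The notation W, e, R(h), and N(h, r) is mine, not the original’s:

  % Hedged reconstruction of the lost figures; all symbols are assumed notation:
  % W is a learned weight vector, e are embeddings, R(h) the relations of head h,
  % and N(h, r) the tail entities reachable from h via relation r.
  \alpha_{h,r} = \frac{\exp\big(\mathrm{LeakyReLU}(\mathbf{W}[e_h \,\Vert\, e_r])\big)}{\sum_{r' \in \mathcal{R}(h)} \exp\big(\mathrm{LeakyReLU}(\mathbf{W}[e_h \,\Vert\, e_{r'}])\big)}, \qquad
  \beta_{r,t} = \frac{\exp\big(\mathrm{LeakyReLU}(\mathbf{W}[e_t \,\Vert\, e_r])\big)}{\sum_{t' \in \mathcal{N}(h,r)} \exp\big(\mathrm{LeakyReLU}(\mathbf{W}[e_{t'} \,\Vert\, e_r])\big)}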

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 10 (3 by maintainers)

Top GitHub Comments

2 reactions
junkangwu commented, Apr 14, 2021

Thanks a lot for your patience. I tried a trick that addresses it. The main idea is to control the indices passed to the softmax function: we just need to prepare some index tensors in advance, which does not affect the training speed. I made a custom adjustment to your softmax function:

from torch_scatter import scatter, gather_csr, segment_csr

def softmax(src, index, N, index2=None, ptr=None):
    # Grouped, numerically stable softmax: entries of `src` that share the
    # same value in `index` are normalized together; `N` is the number of groups.
    if ptr is not None:
        src_max = gather_csr(segment_csr(src, ptr, reduce='max'), ptr)
        out = (src - src_max).exp()
        out_sum = gather_csr(segment_csr(out, ptr, reduce='sum'), ptr)
    elif index2 is not None:
        # the repeated (head, relation) pairs were mapped to a dummy group, so
        # we select the per-group statistics back out with index2 instead of index
        src_max = scatter(src, index, dim=0, dim_size=N, reduce='max')[index2]
        out = (src - src_max).exp()
        out_sum = scatter(out, index, dim=0, dim_size=N, reduce='sum')[index2]
    elif index is not None:
        src_max = scatter(src, index, dim=0, dim_size=N, reduce='max')[index]
        out = (src - src_max).exp()
        out_sum = scatter(out, index, dim=0, dim_size=N, reduce='sum')[index]
    else:
        raise NotImplementedError
    return out / (out_sum + 1e-16)
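As a quick sanity check (my addition, not part of the original comment), the plain-index branch should agree with torch.softmax within each group:

import torch
# assumes the softmax function above is in scope
src = torch.tensor([[1.0], [2.0], [3.0]])
index = torch.tensor([0, 0, 1])  # rows 0 and 1 form one group, row 2 its own
print(softmax(src, index, N=2))
# rows 0 and 1 match torch.softmax(torch.tensor([1., 2.]), dim=0); row 2 is 1.0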

The detailed attention computation proceeds as follows:

import torch
import torch.nn as nn

edge_index = torch.LongTensor([[0, 0, 0, 0, 0, 0, 2],
                               [1, 4, 2, 3, 5, 6, 3]])
# edge_index = torch.LongTensor([[0, 1, 1, 2, 2, 1, 0],
#                                [1, 4, 2, 3, 5, 6, 3]])
edge_type = torch.LongTensor([0, 0, 1, 2, 2, 2, 3])
r = torch.randn(4, 4)    # embeddings for the relation types
h = torch.randn(7, 4)    # embeddings for the entities
weight = torch.randn(8, 1)

# prepare the group index for beta
dict_rt = {}
index_rt = []
cnt = 0
for i in range(edge_index.size(1)):
    key = (edge_index[0, i].item(), edge_type[i].item())
    if key not in dict_rt:
        index_rt.append(cnt)
        dict_rt[key] = cnt
        cnt += 1
    else:
        # edges sharing the same (head, relation) get the same index, so their
        # attention scores are summed within that group
        index_rt.append(dict_rt[key])

# attention between tail and relation, shape [num_edge, 1]
beta = torch.cat([h[edge_index[1]], r[edge_type]], dim=1) @ weight
beta = nn.LeakyReLU(0.2)(beta)
print('beta before')
print(beta)
beta = softmax(beta, torch.LongTensor(index_rt), 7)
print('beta after')
print(beta)

# prepare the group index for alpha
index_hr = []
dict_hr = {}
cnt = 0
for i in range(edge_index.size(1)):
    key = (edge_index[0, i].item(), edge_type[i].item())
    if key not in dict_hr:
        # the first edge of each (head, relation) pair is grouped by its head,
        # so the softmax denominator sums over the head's distinct relations
        index_hr.append(edge_index[0, i].item())
        dict_hr[key] = cnt
        cnt += 1
    else:
        # repeated (head, relation) pairs are thrown out of the denominator by
        # mapping them to a dummy (maximum) index, which avoids double counting
        index_hr.append(edge_index.size(1) - 1)

# attention between head and relation, shape [num_edge, 1]
alpha = torch.cat([h[edge_index[0]], r[edge_type]], dim=1) @ weight
alpha = nn.LeakyReLU(0.2)(alpha)
print('alpha before')
print(alpha)
alpha = softmax(alpha, torch.LongTensor(index_hr), 7, edge_index[0])
print('alpha after')
print(alpha)

Focusing on the example in the figure, the final result is (the exact numbers depend on the random initialization):

beta before
tensor([[-0.4961],
        [-0.3258],
        [ 2.4938],
        [ 3.6552],
        [ 3.0433],
        [ 2.1643],
        [ 3.5956]])
beta after
tensor([[0.4575],
        [0.5425],
        [1.0000],
        [0.5658],
        [0.3068],
        [0.1274],
        [1.0000]])
alpha before
tensor([[ 0.0215],
        [ 0.0215],
        [-0.3617],
        [-0.0067],
        [-0.0067],
        [-0.0067],
        [ 4.2091]])
alpha after
tensor([[0.3768],
        [0.3768],
        [0.2569],
        [0.3663],
        [0.3663],
        [0.3663],
        [1.0000]])
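As a side note (my own suggestion, not part of the original answer), the Python loop that builds index_rt can be replaced by a vectorized torch.unique call. The group ids come out in sorted rather than first-seen order, but the grouping, and therefore the softmax result, is the same:

num_rel = r.size(0)
# one unique integer key per (head, relation) pair
pair_key = edge_index[0] * num_rel + edge_type
# return_inverse maps every edge to its pair's group id; the result can be
# passed to softmax in place of torch.LongTensor(index_rt)
_, index_rt_vec = torch.unique(pair_key, return_inverse=True)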

Overall, thanks a lot for your patience and for sharing!

0 reactions
junkangwu commented, Jun 20, 2022


> Hello, I want to reproduce RGHAT. Did you manage to reproduce its results? I think that computing alpha and beta at the same time based on PyG is still hard.

Sorry for the late reply. Unfortunately, I failed to reproduce it…
