flow is not handled correctly in k_hop_subgraph
I am not sure whether this comes from my misunderstanding of the “flow” and “edge_index” definitions or whether it is a bug. As I understand it, edge_index is a matrix of shape [2, num_edges] whose first row holds the source nodes (origin) and whose second row holds the target nodes (destination). If so, flow == 'source_to_target' should mean that we go from the first row of edge_index to the second row. Looking inside the k_hop_subgraph function, however, I found that you are using:
if flow == 'target_to_source':
    row, col = edge_index
else:
    col, row = edge_index
Assuming 'source_to_target', which takes the else branch in the code above, this means to me that row is the destination and col is the origin. If this is the case, then I think the row and col variables should be exchanged in the following code (inside k_hop_subgraph):
for _ in range(num_hops):
    node_mask.fill_(False)
    node_mask[subsets[-1]] = True
    torch.index_select(node_mask, 0, row, out=edge_mask)
    subsets.append(col[edge_mask])
That is, I think the code above should be changed to:
for _ in range(num_hops):
    node_mask.fill_(False)
    node_mask[subsets[-1]] = True
    torch.index_select(node_mask, 0, col, out=edge_mask)  # was: row
    subsets.append(row[edge_mask])  # was: col
I apologize in advance if I am missing something. If I am, it would be great if you could clarify the definitions of edge_index and flow in the documentation using a simple example with a directed graph.
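To make the question concrete, here is a minimal sketch in plain Python (lists instead of tensors) of what one loop iteration computes on the directed path graph 0 -> 1 -> 2, once as the loop is currently written and once with row and col exchanged. The helper name one_hop is hypothetical and only mirrors the two lines of the loop quoted above; it is not part of the library.

```python
# Directed path graph 0 -> 1 -> 2 in the [2, num_edges] layout:
# first row = source nodes, second row = target nodes.
edge_index = [[0, 1],
              [1, 2]]


def one_hop(seed, row, col):
    """Hypothetical helper mirroring one loop iteration: mark the edges
    whose `row` endpoint is the seed, then collect their `col` endpoints."""
    edge_mask = [r == seed for r in row]
    return sorted({c for c, keep in zip(col, edge_mask) if keep})


# flow == 'source_to_target' takes the else branch: col, row = edge_index
col, row = edge_index  # col = sources [0, 1], row = targets [1, 2]

existing = one_hop(1, row, col)  # index by row (targets), collect col (sources)
proposed = one_hop(1, col, row)  # row and col exchanged, as suggested above

print(existing)  # [0]
print(proposed)  # [2]
```

Starting from node 1, the loop as written collects node 0 (the node with an edge into node 1), while the exchanged version collects node 2 (the node that node 1 points to).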
Issue Analytics
- Created 3 years ago
- Reactions: 1
- Comments: 8 (5 by maintainers)
Top GitHub Comments
Updated the doc accordingly, see https://pytorch-geometric.readthedocs.io/en/latest/modules/utils.html#torch_geometric.utils.k_hop_subgraph.
Yes, you are right. It is quite hard to understand and I am trying to improve this (at least in PyG 2.0). At the moment, k_hop_subgraph returns a subgraph that is suitable for message passing. Consider flow="source_to_target": then we want a subgraph where messages flow to node_idx. Hence, node_idx should be considered as a target node.
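The explanation above can be sketched in plain Python on a small directed graph. This is a re-implementation for illustration only, assuming the row/col unpacking and the hop loop quoted in the issue; the function name k_hop_nodes is hypothetical, and it returns only the set of collected nodes.

```python
def k_hop_nodes(node_idx, num_hops, edge_index, flow="source_to_target"):
    """Hypothetical pure-Python sketch of the node collection step."""
    sources, targets = edge_index
    if flow == "target_to_source":
        row, col = sources, targets
    else:
        col, row = sources, targets

    frontier = {node_idx}
    visited = {node_idx}
    for _ in range(num_hops):
        # Keep edges whose `row` endpoint is in the frontier,
        # then move to the `col` endpoints of those edges.
        frontier = {c for r, c in zip(row, col) if r in frontier}
        visited |= frontier
    return sorted(visited)


# Directed edges: 0 -> 1, 1 -> 2, 2 -> 3, 4 -> 2
edge_index = [[0, 1, 2, 4],
              [1, 2, 3, 2]]

# node_idx = 2 is treated as a target: collect nodes whose messages reach it.
print(k_hop_nodes(2, 1, edge_index))                      # [1, 2, 4]
print(k_hop_nodes(2, 2, edge_index))                      # [0, 1, 2, 4]

# With the opposite flow, node_idx = 2 is treated as a source.
print(k_hop_nodes(2, 1, edge_index, "target_to_source"))  # [2, 3]
```

With flow="source_to_target", node 3 is never collected even though the edge 2 -> 3 exists, because no message flows from node 3 to node 2.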