Allow HeteroGraphConv to be applied on a subset of the etypes in a graph.

🐛 Bug

Right now, applying HeteroGraphConv to a graph when modules have been created for only a subset of the graph's etypes raises a KeyError. For the torch model, the corresponding lines are here: https://github.com/dmlc/dgl/blob/5be937a7fbfca0db08a1507744906d61e47a340b/python/dgl/nn/pytorch/hetero.py#L176 and line 189 below. To me that seems like an unnecessary restriction that makes experimenting with heterogeneous graphs needlessly complicated if you want to ignore certain edges for whatever reason.

You can work around it by calling: graph.remove_edges(graph.edge_ids(*graph.edges(etype="ignore_me"), etype="ignore_me"), etype="ignore_me") ~If you attempt this workaround, please note that remove_edges is a bit of a troublemaker on batched graphs as of writing this. See: https://github.com/dmlc/dgl/issues/2310#issuecomment-858456712~

To Reproduce

Steps to reproduce the behavior:

  1. Create a heterograph with at least two etypes.
  2. Create a HeteroGraphConv module with modules for only a subset of the etypes.
  3. Try to apply the module -> 💥

Expected behavior

As far as I understand, HeteroGraphConv should just ignore the edges for which no module exists. In terms of code, adding:

if etype not in self.mods:
   continue

below this line and this line should fix it in the torch implementation, and the code for the other implementations looks reasonably similar.
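
The effect of the proposed guard can be illustrated without DGL. The following is a minimal pure-Python sketch, not the actual DGL source: `apply_per_etype`, `mods`, and `skip_missing` are hypothetical names, and the loop only mimics a per-etype module lookup of the kind that raises the KeyError.

```python
def apply_per_etype(canonical_etypes, mods, skip_missing):
    """Toy stand-in for a per-edge-type dispatch loop (not DGL code)."""
    outputs = {}
    for stype, etype, dtype in canonical_etypes:
        if skip_missing and etype not in mods:
            # Proposed behaviour: ignore edge types that have no module.
            continue
        # Without the guard, this lookup raises KeyError for unknown etypes.
        result = mods[etype](stype, dtype)
        outputs.setdefault(dtype, []).append(result)
    return outputs


etypes = [("N", "A", "N"), ("N", "B", "N")]
# Hypothetical module table: only etype "A" has a module, "B" does not.
mods = {"A": lambda src, dst: f"conv_A({src}->{dst})"}

try:
    apply_per_etype(etypes, mods, skip_missing=False)
    raised = False
except KeyError:
    raised = True

skipped = apply_per_etype(etypes, mods, skip_missing=True)
print(raised, skipped)
```

With the guard enabled, only the results for etype "A" are collected; without it, the missing "B" entry aborts the whole call.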

Environment

  • DGL Version (e.g., 1.0): 0.6.1
  • Backend Library & Version (e.g., PyTorch 0.4.1, MXNet/Gluon 1.3): Any
  • OS (e.g., Linux): Linux
  • How you installed DGL (conda, pip, source): pip
  • Python version: 3.8

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 8 (8 by maintainers)

Top GitHub Comments

yunshiuan commented, Aug 11, 2021

@NiklasBeierl Thank you for your detailed response! That clarifies my question about the problems with your fix. Regarding my troubles with edge_type_subgraph(), the solution you proposed works if I am feeding the entire graph into the GNN layers.

However, if I want to train the GNN with mini-batches using EdgeDataLoader (https://docs.dgl.ai/en/0.6.x/api/python/dgl.dataloading.html#dgl.dataloading.pytorch.EdgeDataLoader), your solution would not work. Note that EdgeDataLoader returns an iterator of input_nodes, pos_pair_graph, neg_pair_graph, blocks. For pos_pair_graph, it should include type A edges because they’re the links I want to predict. But for blocks (the message flow graphs (MFGs)), it should NOT include type A edges because I want to exclude them from message passing. Note that it’s not possible to apply edge_type_subgraph() on MFGs, so there’s no way I can get rid of type A edges from blocks when feeding them into the GNN.

NiklasBeierl commented, Aug 11, 2021

Hey @yunshiuan,

I think the edge case that @jermainewang is describing there is this: Imagine a node in the graph is only connected to the rest of the graph with an edge of one specific type. Here node 3 only has one incoming edge, which is of type B.

>>> import dgl
Using backend: pytorch
>>> node_data = {
... ("N", "A", "N"): ([1,1], [0,2]),
... ("N", "B","N"): ([2,2],[0,3])
... }
>>> graph = dgl.heterograph(node_data)

Now, if we ignore the B edges, Node 3 becomes a Node with degree 0. (It has no neighbors). If you now take a look at many of the Graph convolution modules in dgl.nn you will see that this is a problem for a lot of them. A search for zero_in_degree on this page will give you a good impression. It usually ends up being a divide-by-0 situation. c_ij in the GraphConv formula is a good example.
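
To make the zero-in-degree problem concrete, here is a small pure-Python sketch (no DGL required) that counts incoming edges for the toy graph above once the B edges are ignored. The `in_degrees` helper is hypothetical, and the 1 / in-degree scaling mentioned in the comment is an assumed simplification standing in for whatever normalization a real convolution module applies.

```python
from collections import Counter

# Same edges as the toy graph: etype -> (source ids, destination ids).
edges = {
    "A": ([1, 1], [0, 2]),
    "B": ([2, 2], [0, 3]),
}
NUM_NODES = 4


def in_degrees(edges, keep_etypes, num_nodes):
    """Count incoming edges per node, using only the kept edge types."""
    counts = Counter()
    for etype in keep_etypes:
        _, dst = edges[etype]
        counts.update(dst)
    return [counts[n] for n in range(num_nodes)]


deg_all = in_degrees(edges, ["A", "B"], NUM_NODES)
deg_no_b = in_degrees(edges, ["A"], NUM_NODES)
zero_in = [n for n, d in enumerate(deg_no_b) if d == 0]

# A mean-style aggregator would scale messages by 1 / in_degree, which is
# undefined for every node listed in zero_in.
print(deg_all, deg_no_b, zero_in)
```

Node 1 has no incoming edges in either case, since it only acts as a source. Node 3 is the one that loses its only incoming edge when B is dropped.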

Now to your troubles with edge_type_subgraph(): I am not sure what you mean by the edge you want to predict being “incorrectly” removed. You mean that you need to keep it to later calculate your loss? I haven’t done any link prediction yet. But shouldn’t you be able to feed a copy of your graph without type A edges (obtained from edge_type_subgraph()) into the module while keeping your original graph stored in another variable to compare?

>>> without_a= graph.edge_type_subgraph(["B"]) # Use this for training
>>> without_a
Graph(num_nodes=4, num_edges=2,
      ndata_schemes={}
      edata_schemes={})
>>> graph # Stays the same, still has `A` edges in it
Graph(num_nodes={'N': 4},
      num_edges={('N', 'A', 'N'): 2, ('N', 'B', 'N'): 2},
      metagraph=[('N', 'N', 'A'), ('N', 'N', 'B')])

Now perhaps you still need to deal with 0-degree nodes (edge_type_subgraph does not take care of that). You can allow them if your GConv module has an allow_zero_in_degree argument, or remove them like so.


