Hetero R-GCN not matching OGB MAG Leaderboard performance
🐛 Bug
In an effort to directly reproduce the “NeighborSampling (R-GCN aggr)” results using DGL, I get test performance of 40.17 ± 0.62, compared to 46.85 ± 0.48 from the PyG code linked on the leaderboard.
To Reproduce
Steps to reproduce the behavior:
- I adapted the example in the DGL examples directory, examples/pytorch/rgcn-hetero/entity_classify_mb.py. My implementation can be found in this repo; run it with python rgcn_hetero_dgl.py. A minimal sketch of the kind of setup involved follows these steps.
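For context, here is a minimal sketch of the kind of DGL setup involved (not the repro code itself — the actual implementation lives in the linked repo). The HeteroRGCN class name, the [25, 20] fan-outs, batch size, and hidden size are illustrative, and the embeddings for featureless node types as well as the training loop are omitted:

```python
# Minimal sketch: a 2-layer hetero R-GCN with neighbor sampling on ogbn-mag,
# using the DGL 0.7-era dataloading API. Sizes/fan-outs are illustrative.
import torch
import torch.nn as nn
import dgl
import dgl.nn as dglnn
from ogb.nodeproppred import DglNodePropPredDataset

dataset = DglNodePropPredDataset("ogbn-mag")
g, labels = dataset[0]                                # heterograph + {'paper': labels}
train_nids = dataset.get_idx_split()["train"]["paper"]

class HeteroRGCN(nn.Module):
    def __init__(self, etypes, in_dim, hid_dim, n_classes):
        super().__init__()
        # One GraphConv per edge type, aggregated ("sum") across edge types per node type.
        # allow_zero_in_degree avoids errors for destination nodes that received
        # no sampled neighbors under a given edge type.
        self.conv1 = dglnn.HeteroGraphConv(
            {et: dglnn.GraphConv(in_dim, hid_dim, allow_zero_in_degree=True)
             for et in etypes}, aggregate="sum")
        self.conv2 = dglnn.HeteroGraphConv(
            {et: dglnn.GraphConv(hid_dim, n_classes, allow_zero_in_degree=True)
             for et in etypes}, aggregate="sum")

    def forward(self, blocks, x):
        # x: dict of node type -> features of the source nodes of blocks[0].
        # ogbn-mag only ships features for 'paper'; the other node types need
        # learnable embeddings, which this sketch omits.
        h = self.conv1(blocks[0], x)
        h = {k: torch.relu(v) for k, v in h.items()}
        h = self.conv2(blocks[1], h)
        return h["paper"]

sampler = dgl.dataloading.MultiLayerNeighborSampler([25, 20])
loader = dgl.dataloading.NodeDataLoader(
    g, {"paper": train_nids}, sampler,
    batch_size=1024, shuffle=True, drop_last=False)
model = HeteroRGCN(g.etypes, in_dim=128, hid_dim=64, n_classes=349)
# Training loop over (input_nodes, output_nodes, blocks) mini-batches omitted.
```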
Expected behavior
I made an effort to exactly duplicate the PyG model architecture and parameter choices, so I expect comparable performance.
Environment
- Training on a single GPU
- DGL 0.7.0 (pip)
- PyTorch 1.9.0+cu111
- Python 3.6.9
- Linux OS
Another update: I’m now able to slightly beat PyG. The only change was to capture performance statistics in the same way they do: previously I was reporting performance at the end of the 3 epochs, but they actually report the train/test performance from the epoch in which the validation accuracy was maximized (code).
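For clarity, a minimal sketch of that reporting convention (run_epoch is a hypothetical helper standing in for one epoch of training plus evaluation):

```python
# Sketch: keep the train/test accuracy from the epoch with the highest
# validation accuracy, rather than reporting the final-epoch numbers.
num_epochs = 3  # the runs discussed above train for 3 epochs
best_val = best_train = best_test = 0.0
for epoch in range(num_epochs):
    train_acc, val_acc, test_acc = run_epoch()  # hypothetical helper returning accuracies
    if val_acc > best_val:
        best_val, best_train, best_test = val_acc, train_acc, test_acc
print(f"Train: {best_train:.4f}, Valid: {best_val:.4f}, Test: {best_test:.4f}")
```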
However, a discrepancy remains: the training accuracy is still significantly higher for DGL, and it appears to overfit in fewer epochs. This may be an artifact of differences in weight initialization. I’ll investigate this next.
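One way to test the initialization hypothesis — a sketch, assuming the PyG layers use Glorot/Xavier-style initialization, which differs from PyTorch's default for linear weights — is to re-initialize the DGL model's parameters explicitly and re-run:

```python
import torch.nn as nn

def reinit_glorot(model):
    # Blunt re-initialization: Xavier/Glorot uniform for weight matrices,
    # zeros for 1-D parameters (biases). Assumes the model has no layers
    # (e.g. batch norm) whose 1-D parameters should not be zeroed.
    for _, param in model.named_parameters():
        if param.dim() >= 2:
            nn.init.xavier_uniform_(param)
        else:
            nn.init.zeros_(param)

reinit_glorot(model)  # `model` is the hetero R-GCN being compared against PyG
```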
Hi @zjost, thanks for the great effort. It would be greatly appreciated if you could submit a PR adding an example for ogbn-mag to this folder.