
Cannot obtain deterministic results when using `dgl.nn.pytorch.GraphConv`, even when the random seeds are fixed.

See original GitHub issue

🐛 Bug

In our experiment, we split the training/validation/test sets of a graph dataset (Cora, for example) many times. Each split was trained in a separate run and the classification accuracy was recorded. We found that when we construct the GCN with dgl.nn.pytorch.GraphConv, the results of repeated experiments are still inconsistent, even though all random seeds (numpy, torch, dgl) are fixed.

The experiment code and data are here.
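
For reference, the seed fixing we do at the start of every run looks roughly like the sketch below (set_seed is just an illustrative name; the exact helper in the linked repo may differ):

```python
import os
import random

import numpy as np
import torch
import dgl


def set_seed(seed: int = 0) -> None:
    """Fix every RNG we are aware of before a run (illustrative helper)."""
    random.seed(seed)                 # Python's built-in RNG
    np.random.seed(seed)              # NumPy
    torch.manual_seed(seed)           # PyTorch CPU
    torch.cuda.manual_seed_all(seed)  # PyTorch, all GPUs
    dgl.seed(seed)                    # DGL's own RNG

    # Extra switches usually needed for CUDA determinism; without them some
    # GPU kernels (e.g. atomic scatter-adds used in message passing) can
    # still produce different results under identical seeds.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
```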

Run python main.py --use_dgl --dataset cora; the results are:

Repeat 0:
Run 0, test_acc: 0.8451
Run 1, test_acc: 0.8571
Run 2, test_acc: 0.8270
Run 3, test_acc: 0.8290
Run 4, test_acc: 0.8290
Repeat 1:
Run 0, test_acc: 0.8592
Run 1, test_acc: 0.8551
Run 2, test_acc: 0.8149
Run 3, test_acc: 0.8229
Run 4, test_acc: 0.8209

Run python main.py --use_dgl --dataset texas; the results are:

Repeat 0:
Run 0, test_acc: 0.0541
Run 1, test_acc: 0.3784
Run 2, test_acc: 0.4054
Run 3, test_acc: 0.5135
Run 4, test_acc: 0.4595
Repeat 1:
Run 0, test_acc: 0.4054
Run 1, test_acc: 0.3784
Run 2, test_acc: 0.4054
Run 3, test_acc: 0.5405
Run 4, test_acc: 0.4324

As shown, for both datasets (Cora and Texas), the results of the two repeats are inconsistent.

We also built another GCN that does not use dgl.nn.pytorch.GraphConv (a rough sketch of such a layer is shown below).
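
For illustration only, a GCN-style layer that avoids GraphConv can be written as a plain adjacency-times-features product. The sketch assumes an unnormalized adjacency matrix, which is consistent with the low accuracies reported below; the actual layer we use is in the linked repo and may differ:

```python
import torch
import torch.nn as nn


class PlainGCNLayer(nn.Module):
    """Illustrative GCN-style layer that avoids dgl.nn.pytorch.GraphConv:
    just A @ X @ W, with no degree normalization."""

    def __init__(self, in_feats: int, out_feats: int):
        super().__init__()
        self.linear = nn.Linear(in_feats, out_feats)

    def forward(self, adj: torch.Tensor, feat: torch.Tensor) -> torch.Tensor:
        # adj: (N, N) sparse adjacency matrix; feat: (N, in_feats)
        h = self.linear(feat)
        return torch.sparse.mm(adj, h)  # plain neighborhood sum, no normalization
```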

Run python main.py --dataset cora; the results are:

Repeat 0:
Run 0, test_acc: 0.1650
Run 1, test_acc: 0.1288
Run 2, test_acc: 0.1429
Run 3, test_acc: 0.1469
Run 4, test_acc: 0.1368
Repeat 1:
Run 0, test_acc: 0.1650
Run 1, test_acc: 0.1288
Run 2, test_acc: 0.1429
Run 3, test_acc: 0.1469
Run 4, test_acc: 0.1368

Run python main.py --dataset texas; the results are:

Repeat 0:
Run 0, test_acc: 0.0541
Run 1, test_acc: 0.2973
Run 2, test_acc: 0.1351
Run 3, test_acc: 0.1081
Run 4, test_acc: 0.1081
Repeat 1:
Run 0, test_acc: 0.0541
Run 1, test_acc: 0.2973
Run 2, test_acc: 0.1351
Run 3, test_acc: 0.1081
Run 4, test_acc: 0.1081

As shown above, for the GCN that does not use dgl.nn.pytorch.GraphConv, the results of the two repeats are consistent. The poor accuracy of that model is due to the lack of normalization and other operations, but this does not affect whether deterministic results are produced.
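
For anyone trying to narrow this down, one quick check is to run the same GraphConv forward pass twice under the same seed and compare the outputs. The minimal sketch below uses a small random graph rather than our training script:

```python
import torch
import dgl
from dgl.nn.pytorch import GraphConv


def graphconv_forward(seed: int = 0) -> torch.Tensor:
    """Build the same random graph and GraphConv under a fixed seed
    and return one forward pass."""
    torch.manual_seed(seed)
    dgl.seed(seed)
    g = dgl.add_self_loop(dgl.rand_graph(100, 500))  # 100 nodes, 500 random edges
    feat = torch.randn(g.num_nodes(), 16)
    conv = GraphConv(16, 8)
    return conv(g, feat)


# On CPU the two outputs should match exactly; if they diverge only after
# moving the graph, features and layer to the GPU, the non-determinism is
# coming from the aggregation kernels rather than from the seeds.
print(torch.equal(graphconv_forward(0), graphconv_forward(0)))
```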

Issue Analytics

  • State: open
  • Created: a year ago
  • Reactions: 1
  • Comments: 21 (3 by maintainers)

Top GitHub Comments

1 reaction
jinfy commented, Oct 10, 2022

Reopened as more people are getting similar problems. @zxqaaaaa @jinfy Could you provide a runnable script for us to reproduce?

We released a demo that does a link prediction task using DGL. The code is here.
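
For context, the edge-scoring part of a DGL link-prediction model typically looks something like the sketch below; this is a generic pattern, not code from the released demo:

```python
import torch
import dgl
import dgl.function as fn


class DotProductPredictor(torch.nn.Module):
    """Score each candidate edge by the dot product of its endpoint
    embeddings -- a common link-prediction head, not the demo's exact code."""

    def forward(self, g: dgl.DGLGraph, h: torch.Tensor) -> torch.Tensor:
        with g.local_scope():
            g.ndata["h"] = h                              # node embeddings
            g.apply_edges(fn.u_dot_v("h", "h", "score"))  # score = <h_u, h_v>
            return g.edata["score"]
```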

1 reaction
bzbzbz22 commented, Jun 23, 2022

In addition, the problem also occurs when we use some DGL functions, such as dgl.function.u_mul_e(). The DGL version we use is 0.7.2, and I also tested version 0.8.2; the problem still occurs.
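
The pattern in question is roughly the following; the sum reduction over incoming edges is where non-deterministic GPU kernels would show up (a minimal sketch, not our full model):

```python
import torch
import dgl
import dgl.function as fn


def weighted_sum_aggregate(g: dgl.DGLGraph, feat: torch.Tensor,
                           edge_weight: torch.Tensor) -> torch.Tensor:
    """Multiply each source-node feature by its edge weight and sum over
    incoming edges -- the u_mul_e / sum pattern mentioned above."""
    with g.local_scope():
        g.ndata["h"] = feat            # (N, D) node features
        g.edata["w"] = edge_weight     # (E, 1) so it broadcasts over D
        g.update_all(fn.u_mul_e("h", "w", "m"), fn.sum("m", "h_out"))
        return g.ndata["h_out"]
```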

Read more comments on GitHub >

Top Results From Across the Web

Frequently Asked Questions (FAQ) - Deep Graph Library
How to get deterministic training results? You need to fix the random seed of Python, NumPy, backend framework (e.g., PyTorch), and DGL (with ...

Random State, Manual Seed & Oscillating Loss
Hello, I have a simple model with three GraphConv layers (DGL) and one nn.Linear layer, trying to do predictions over a graph with...

Nondeterminism even when setting all seeds, 0 workers, and ...
I am using Python 3.6.3, PyTorch 0.4.1, and a single NVIDIA Tesla K80 GPU, on CentOS Linux 7.4.1708. import numpy as np import...

Unable to obtain deterministic training - vision - PyTorch Forums
Here's what I set in the beginning of the main training entry script: random.seed(123) torch.manual_seed(123) np.random.seed(123) ...

arXiv:2107.01188v2 [cs.LG] 22 Apr 2022
Here we demonstrate how graph neural networks can be used to solve combinatorial optimization problems. Our approach is broadly applicable to ...
