Cannot obtain deterministic results when using `dgl.nn.pytorch.GraphConv`, even when the random seeds are fixed.

🐛 Bug

In our experiment, we split a graph dataset (Cora, for example) into training/validation/test sets multiple times. Each split is trained in a separate run, and the classification accuracy is computed. We found that when the GCN is built with `dgl.nn.pytorch.GraphConv`, the results of repeated experiments are inconsistent even though all random seeds (numpy, torch, dgl) are fixed.
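For reference, fixing "all random seeds (numpy, torch, dgl)" usually looks something like the sketch below. This is not the code from the linked repository; the helper name `seed_everything` is hypothetical, and the set of calls is an assumption about what a typical setup includes. Each library is seeded only if it can be imported, so the helper also runs in an environment without torch or dgl.

```python
import os
import random
from typing import List


def seed_everything(seed: int) -> List[str]:
    """Seed every RNG that is importable; return the names that were seeded."""
    seeded = ["random"]
    random.seed(seed)
    # Only affects hash randomization in subprocesses started after this point.
    os.environ["PYTHONHASHSEED"] = str(seed)

    try:
        import numpy as np
        np.random.seed(seed)
        seeded.append("numpy")
    except ImportError:
        pass

    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # silently ignored when CUDA is unavailable
        # Ask cuDNN for deterministic kernels and disable autotuning.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
        seeded.append("torch")
    except ImportError:
        pass

    try:
        import dgl
        dgl.seed(seed)
        seeded.append("dgl")
    except ImportError:
        pass

    return seeded
```

Calling `seed_everything(0)` at the start of every repeat fixes the seeds. As this issue shows, that alone may not be enough to make the results identical.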

The experiment code and data are available here.

Running `python main.py --use_dgl --dataset cora` gives the following results:

Repeat 0:
Run 0, test_acc: 0.8451
Run 1, test_acc: 0.8571
Run 2, test_acc: 0.8270
Run 3, test_acc: 0.8290
Run 4, test_acc: 0.8290
Repeat 1:
Run 0, test_acc: 0.8592
Run 1, test_acc: 0.8551
Run 2, test_acc: 0.8149
Run 3, test_acc: 0.8229
Run 4, test_acc: 0.8209

Running `python main.py --use_dgl --dataset texas` gives the following results:

Repeat 0:
Run 0, test_acc: 0.0541
Run 1, test_acc: 0.3784
Run 2, test_acc: 0.4054
Run 3, test_acc: 0.5135
Run 4, test_acc: 0.4595
Repeat 1:
Run 0, test_acc: 0.4054
Run 1, test_acc: 0.3784
Run 2, test_acc: 0.4054
Run 3, test_acc: 0.5405
Run 4, test_acc: 0.4324

As shown, for both datasets (Cora and Texas), the results of the two repeats are inconsistent.
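To make the mismatch concrete, here is a small sketch that compares the two repeats run by run, using the accuracies reported above. The helper `differing_runs` is a hypothetical name introduced for illustration and is not part of the experiment code.

```python
from typing import List, Sequence


def differing_runs(a: Sequence[float], b: Sequence[float]) -> List[int]:
    """Return the indices of runs whose accuracies differ between two repeats."""
    if len(a) != len(b):
        raise ValueError("both repeats must contain the same number of runs")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Accuracies copied from the --use_dgl output above (Repeat 0, Repeat 1).
cora_dgl = ([0.8451, 0.8571, 0.8270, 0.8290, 0.8290],
            [0.8592, 0.8551, 0.8149, 0.8229, 0.8209])
texas_dgl = ([0.0541, 0.3784, 0.4054, 0.5135, 0.4595],
             [0.4054, 0.3784, 0.4054, 0.5405, 0.4324])

print(differing_runs(*cora_dgl))   # [0, 1, 2, 3, 4]
print(differing_runs(*texas_dgl))  # [0, 3, 4]
```

On Cora every run differs between the repeats. On Texas, runs 1 and 2 happen to match while runs 0, 3, and 4 do not.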

We also built another GCN that does not use `dgl.nn.pytorch.GraphConv`.

Running `python main.py --dataset cora` gives the following results:

Repeat 0:
Run 0, test_acc: 0.1650
Run 1, test_acc: 0.1288
Run 2, test_acc: 0.1429
Run 3, test_acc: 0.1469
Run 4, test_acc: 0.1368
Repeat 1:
Run 0, test_acc: 0.1650
Run 1, test_acc: 0.1288
Run 2, test_acc: 0.1429
Run 3, test_acc: 0.1469
Run 4, test_acc: 0.1368

Running `python main.py --dataset texas` gives the following results:

Repeat 0:
Run 0, test_acc: 0.0541
Run 1, test_acc: 0.2973
Run 2, test_acc: 0.1351
Run 3, test_acc: 0.1081
Run 4, test_acc: 0.1081
Repeat 1:
Run 0, test_acc: 0.0541
Run 1, test_acc: 0.2973
Run 2, test_acc: 0.1351
Run 3, test_acc: 0.1081
Run 4, test_acc: 0.1081

As shown above, for the GCN built without `dgl.nn.pytorch.GraphConv`, the results of the two repeats are consistent. The model's poor accuracy is due to the lack of normalization and other operations, which does not affect whether the results are deterministic.

jinfy commented, Oct 10, 2022

Reopened as more people are getting similar problems. @zxqaaaaa @jinfy Could you provide a runnable script for us to reproduce?

We released a demo that performs a link prediction task using DGL. The code is here.

bzbzbz22 commented, Jun 23, 2022

In addition, the problem also occurs when we use other DGL functions, such as `dgl.function.u_mul_e()`. We use DGL 0.7.2; I also tested with version 0.8.2, and the problem occurs there as well.
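This thread does not establish a root cause. One general mechanism worth keeping in mind, offered here only as an assumption about what might be relevant, is that floating-point addition is not associative. Any aggregation that sums the same values in a different order from run to run can therefore produce slightly different results even with identical seeds. The pure-Python sketch below illustrates the effect and does not use DGL at all.

```python
def left_to_right_sum(values):
    """Sum floats strictly left to right, as a sequential reduction would."""
    total = 0.0
    for v in values:
        total += v
    return total


# The same three numbers, summed in two different orders.
values = [1e16, 1.0, -1e16]
reordered = [1e16, -1e16, 1.0]

print(left_to_right_sum(values))     # 0.0 (the 1.0 is lost when added to 1e16)
print(left_to_right_sum(reordered))  # 1.0
```

If small differences of this kind occur during message aggregation, training can amplify them into the accuracy gaps reported above. Whether that is what happens here would need to be confirmed by the maintainers.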