Cannot obtain deterministic results when using `dgl.nn.pytorch.GraphConv`, even when the random seeds are fixed.
🐛 Bug
In our experiment, we repeatedly split a graph dataset (taking Cora as an example) into training/validation/test sets. A model was trained on each split in a separate run, and its classification accuracy was calculated.
We found that when we use `dgl.nn.pytorch.GraphConv` to construct the GCN, the results of repeated experiments are still inconsistent, even though all random seeds (numpy, torch, dgl) are fixed.
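For reference, fixing "all random seeds (numpy, torch, dgl)" typically looks something like the sketch below. This is an assumption about the setup, not the code from the issue's `main.py`; the helper name `set_seed` is hypothetical, and the imports are guarded so the snippet still runs where a library is missing. Note that seeding only controls random number generators; it does not by itself make an operation deterministic if its implementation accumulates values in a varying order.

```python
import os
import random


def set_seed(seed: int) -> None:
    """Seed every RNG the experiment might touch (hypothetical helper)."""
    random.seed(seed)
    # Only affects hash randomization in subprocesses started after this point.
    os.environ["PYTHONHASHSEED"] = str(seed)

    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass

    try:
        import torch
        torch.manual_seed(seed)
        # Safe to call without CUDA; it is silently ignored in that case.
        torch.cuda.manual_seed_all(seed)
        # Ask cuDNN for deterministic kernels and disable autotuning.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass

    try:
        import dgl
        dgl.seed(seed)
    except ImportError:
        pass


set_seed(42)
```

Calling `set_seed` with the same value at the start of every repeat is what makes the two repeats comparable in the first place.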
The experiment code and data are available here
Running `python main.py --use_dgl --dataset cora` gives the following results:
Repeat 0:
Run 0, test_acc: 0.8451
Run 1, test_acc: 0.8571
Run 2, test_acc: 0.8270
Run 3, test_acc: 0.8290
Run 4, test_acc: 0.8290
Repeat 1:
Run 0, test_acc: 0.8592
Run 1, test_acc: 0.8551
Run 2, test_acc: 0.8149
Run 3, test_acc: 0.8229
Run 4, test_acc: 0.8209
Running `python main.py --use_dgl --dataset texas` gives the following results:
Repeat 0:
Run 0, test_acc: 0.0541
Run 1, test_acc: 0.3784
Run 2, test_acc: 0.4054
Run 3, test_acc: 0.5135
Run 4, test_acc: 0.4595
Repeat 1:
Run 0, test_acc: 0.4054
Run 1, test_acc: 0.3784
Run 2, test_acc: 0.4054
Run 3, test_acc: 0.5405
Run 4, test_acc: 0.4324
As shown, for both datasets (Cora and Texas), the results of the two repeats are inconsistent.
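The comparison between repeats can be made explicit with a small check like the one below. It is only an illustrative sketch: the helper name `differing_runs` is hypothetical, and the accuracy lists are copied from the output above.

```python
def differing_runs(repeat_a, repeat_b, tol=0.0):
    """Return the indices of runs whose accuracy differs by more than tol."""
    return [
        i
        for i, (a, b) in enumerate(zip(repeat_a, repeat_b))
        if abs(a - b) > tol
    ]


# Accuracies reported above for the GraphConv-based model.
cora_repeat_0 = [0.8451, 0.8571, 0.8270, 0.8290, 0.8290]
cora_repeat_1 = [0.8592, 0.8551, 0.8149, 0.8229, 0.8209]
texas_repeat_0 = [0.0541, 0.3784, 0.4054, 0.5135, 0.4595]
texas_repeat_1 = [0.4054, 0.3784, 0.4054, 0.5405, 0.4324]

print("Cora runs that differ:", differing_runs(cora_repeat_0, cora_repeat_1))
print("Texas runs that differ:", differing_runs(texas_repeat_0, texas_repeat_1))
```

With a tolerance of zero, every Cora run differs between the two repeats, and Texas runs 0, 3, and 4 differ.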
We also built another GCN that doesn't use `dgl.nn.pytorch.GraphConv`.
Running `python main.py --dataset cora` gives the following results:
Repeat 0:
Run 0, test_acc: 0.1650
Run 1, test_acc: 0.1288
Run 2, test_acc: 0.1429
Run 3, test_acc: 0.1469
Run 4, test_acc: 0.1368
Repeat 1:
Run 0, test_acc: 0.1650
Run 1, test_acc: 0.1288
Run 2, test_acc: 0.1429
Run 3, test_acc: 0.1469
Run 4, test_acc: 0.1368
Running `python main.py --dataset texas` gives the following results:
Repeat 0:
Run 0, test_acc: 0.0541
Run 1, test_acc: 0.2973
Run 2, test_acc: 0.1351
Run 3, test_acc: 0.1081
Run 4, test_acc: 0.1081
Repeat 1:
Run 0, test_acc: 0.0541
Run 1, test_acc: 0.2973
Run 2, test_acc: 0.1351
Run 3, test_acc: 0.1081
Run 4, test_acc: 0.1081
As shown above, for the GCN built without `dgl.nn.pytorch.GraphConv`, the results of the two repeats are consistent. The poor accuracy of this model is due to the lack of normalization and other operations, which does not affect whether the results are deterministic.
We released a demo that performs a link prediction task using DGL. The code is available here
In addition, the problem also occurs when we use some other DGL functions, such as `dgl.function.u_mul_e()`. The DGL version we use is 0.7.2. I also tested version 0.8.2, and the problem occurs there as well.