Unable to reproduce benchmark results for (dtnn, qm9, random)
Here is my system information:
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): CentOS release 6.7 (Final) x86_64
CUDA/cuDNN version: 7.5/5.1.3
GPU model and memory: Quadro K600 1GB DDR3
Using a clean checkout from Aug 16th, 614d8a3cc43a6f139a1a8cbff1c9eb581d92b46d, running
python examples/benchmark.py -s random -m dtnn -d qm9 -t
python examples/benchmark.py -s index -m dtnn -d qm9 -t
results.csv gives
qm9,random,regression,dtnn,mean-pearson_r2_score,train,0.58606980772779427,valid,0.53837581774384768,test,0.55762063055664446,time_for_running,14634.339544057846
qm9,index,regression,dtnn,mean-pearson_r2_score,train,0.6549500533371011,valid,0.2431349582030827,test,0.41939512301155074,time_for_running,13377.745332956314
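The rows above are easier to compare once the train, valid, and test scores are separated. The following is a minimal sketch that assumes the layout seen in these two rows (five fixed fields followed by alternating label/value pairs); `parse_result_row` is a hypothetical helper, not part of deepchem.

```python
def parse_result_row(row):
    """Split one results.csv row into its fixed fields and labelled scores.

    Assumed layout (inferred from the rows in this issue):
    dataset, split, task type, model, metric, then label/value pairs.
    """
    fields = row.strip().split(",")
    dataset, split, task_type, model, metric = fields[:5]
    rest = fields[5:]
    # Pair every label (train, valid, test, time_for_running) with its value.
    scores = {label: float(value) for label, value in zip(rest[0::2], rest[1::2])}
    return {
        "dataset": dataset,
        "split": split,
        "task_type": task_type,
        "model": model,
        "metric": metric,
        "scores": scores,
    }


rows = [
    "qm9,random,regression,dtnn,mean-pearson_r2_score,train,0.58606980772779427,"
    "valid,0.53837581774384768,test,0.55762063055664446,"
    "time_for_running,14634.339544057846",
    "qm9,index,regression,dtnn,mean-pearson_r2_score,train,0.6549500533371011,"
    "valid,0.2431349582030827,test,0.41939512301155074,"
    "time_for_running,13377.745332956314",
]

for row in rows:
    parsed = parse_result_row(row)
    s = parsed["scores"]
    print(
        f"{parsed['model']} / {parsed['split']}: "
        f"train={s['train']:.3f} valid={s['valid']:.3f} test={s['test']:.3f}"
    )
```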
The deepchem front page has
Dataset | Model | Splitting | Train score/R2 | Valid score/R2 |
---|---|---|---|---|
qm9 | MT-NN regression | Index | 0.733 | 0.766 |
qm9 | DTNN | Index | 0.918 | 0.831 |
qm9 | MT-NN regression | Random | 0.852 | 0.833 |
qm9 | DTNN | Random | 0.942 | 0.948 |
If I’m reading this right, for the validation set with random splitting I’m getting a Pearson R^2 score of 0.538 (0.558 on the test set), while I expected to get 0.948.
*(added results for index splitting)
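For reference, a Pearson R^2 score is the squared Pearson correlation between the true and predicted values. The sketch below is a plain-Python illustration of that formula, not deepchem's own code; it also assumes that the `mean-` prefix in `mean-pearson_r2_score` means the score is averaged over the prediction tasks. `pearson_r2` and `mean_pearson_r2` are hypothetical names.

```python
def pearson_r2(y_true, y_pred):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov * cov / (var_t * var_p)


def mean_pearson_r2(tasks):
    """Average pearson_r2 over (y_true, y_pred) pairs, one pair per task.

    Assumption: this mirrors what the "mean-" prefix refers to.
    """
    return sum(pearson_r2(t, p) for t, p in tasks) / len(tasks)


# Toy data only; these are not qm9 values.
perfectly_linear = ([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
partly_shuffled = ([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])

print(pearson_r2(*perfectly_linear))
print(pearson_r2(*partly_shuffled))
print(mean_pearson_r2([perfectly_linear, partly_shuffled]))
```

Because the correlation ignores scale and offset, a prediction that is a linear function of the target still scores 1, so a score near 0.5 points to weak correlation rather than to a units or normalization mismatch.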
Issue Analytics
- State:
- Created 6 years ago
- Comments:14 (8 by maintainers)
Top GitHub Comments
Good news, I upgraded rdkit to version 2017.03.3 and I am now getting significantly better results. I will submit a pull request to update the minimum rdkit requirements.
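Before rerunning the benchmark, the installed rdkit version can be compared against the one reported to work here. This is a minimal sketch: `version_tuple` and `is_new_enough` are hypothetical helpers, and the parsing assumes purely numeric dotted versions such as `2017.03.3`.

```python
# Version reported in this comment to give better results.
MIN_RDKIT = "2017.03.3"


def version_tuple(version):
    """Turn '2017.03.3' into (2017, 3, 3) so versions compare numerically.

    Assumption: parts are plain integers; non-numeric parts are skipped.
    """
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def is_new_enough(installed, minimum=MIN_RDKIT):
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


try:
    import rdkit

    print("rdkit", rdkit.__version__, "new enough:", is_new_enough(rdkit.__version__))
except ImportError:
    print("rdkit is not installed")
```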
This is what I got - with featurizer = deepchem.feat.CoulombMatrix(26)