Unable to reproduce benchmark results for (dtnn, qm9, random)

Here is my system information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): CentOS release 6.7 (Final) x86_64
CUDA/cuDNN version: 7.5/5.1.3
GPU model and memory: Quadro K600 1GB DDR3

Using a clean checkout from Aug 16th, 614d8a3cc43a6f139a1a8cbff1c9eb581d92b46d, running

python examples/benchmark.py -s random -m dtnn -d qm9 -t
python examples/benchmark.py -s index -m dtnn -d qm9 -t

results.csv gives

qm9,random,regression,dtnn,mean-pearson_r2_score,train,0.58606980772779427,valid,0.53837581774384768,test,0.55762063055664446,time_for_running,14634.339544057846
qm9,index,regression,dtnn,mean-pearson_r2_score,train,0.6549500533371011,valid,0.2431349582030827,test,0.41939512301155074,time_for_running,13377.745332956314
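The rows above are easier to compare against published numbers once parsed. The following is a minimal sketch, assuming the layout visible in these two rows: five leading fields followed by alternating name/value pairs. The field names `dataset`, `split`, `mode`, `model`, and `metric`, and the helper names, are labels chosen here for illustration and are not taken from DeepChem.

```python
import csv
import io

# Hypothetical labels for the five leading fields; results.csv has no header row.
LEADING_FIELDS = ("dataset", "split", "mode", "model", "metric")


def parse_benchmark_row(row):
    """Parse one results.csv row into a dict.

    Assumes five leading fields, then alternating name/value pairs
    (train, <score>, valid, <score>, test, <score>, time_for_running, <seconds>).
    """
    head, rest = row[:5], row[5:]
    record = dict(zip(LEADING_FIELDS, head))
    # Pair up the remaining fields: even positions are names, odd are values.
    for name, value in zip(rest[0::2], rest[1::2]):
        record[name] = float(value)
    return record


def parse_benchmark_csv(text):
    """Parse every non-empty row of a results.csv-style string."""
    return [parse_benchmark_row(r) for r in csv.reader(io.StringIO(text)) if r]


# The two rows reported above, copied verbatim.
SAMPLE = (
    "qm9,random,regression,dtnn,mean-pearson_r2_score,"
    "train,0.58606980772779427,valid,0.53837581774384768,"
    "test,0.55762063055664446,time_for_running,14634.339544057846\n"
    "qm9,index,regression,dtnn,mean-pearson_r2_score,"
    "train,0.6549500533371011,valid,0.2431349582030827,"
    "test,0.41939512301155074,time_for_running,13377.745332956314\n"
)

records = parse_benchmark_csv(SAMPLE)
for rec in records:
    print(rec["split"], round(rec["train"], 3), round(rec["valid"], 3), round(rec["test"], 3))
```

Parsed this way, the train, valid, and test scores can be read off by name rather than by position in the line.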

The deepchem front page has

Dataset | Model | Splitting | Train score/R2 | Valid score/R2
qm9 | MT-NN regression | Index | 0.733 | 0.766
qm9 | DTNN | Index | 0.918 | 0.831
qm9 | MT-NN regression | Random | 0.852 | 0.833
qm9 | DTNN | Random | 0.942 | 0.948

If I’m reading this right, for random splitting I’m getting a Pearson R^2 score of 0.538 on the validation set (0.558 on the test set), while I expected to get 0.948.

*(added results for index splitting)

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 14 (8 by maintainers)

Top GitHub Comments

1 reaction
momeara commented, Aug 18, 2017

Good news, I upgraded rdkit to version 2017.03.3 and I am now getting significantly better results:

qm9,random,regression,dtnn,mean-pearson_r2_score,train,0.90034446761651732,valid,0.79499177162278145,test,0.82222053202103251,time_for_running,13429.928102970123

I will submit a pull request to update the minimum rdkit requirements.
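A quick way to check whether an environment meets that version before launching a multi-hour benchmark run is sketched below. It assumes `rdkit.__version__` is available as a dotted string such as `2017.03.3`. The helper names `version_tuple` and `is_at_least` are made up for this example. The thread does not say which older rdkit version produced the lower scores, so the check only compares against the version reported to work.

```python
import re

# Version reported in the comment above as giving the better results.
MIN_RDKIT = "2017.03.3"


def version_tuple(version):
    """Turn a dotted version like '2017.03.3' into a tuple of ints.

    Only the leading digits of each component are used; parsing stops
    at the first component that does not start with a digit.
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_at_least(installed, minimum=MIN_RDKIT):
    """Return True if the installed version is >= the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)


try:
    import rdkit
    # Assumption: the installed rdkit exposes __version__ as a string.
    installed = getattr(rdkit, "__version__", None)
except ImportError:
    installed = None

if installed is None:
    print("rdkit not found, or its version could not be determined")
elif is_at_least(installed):
    print("rdkit", installed, "meets the minimum", MIN_RDKIT)
else:
    print("rdkit", installed, "is older than", MIN_RDKIT)
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes with zero-padded components such as `03`.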

0 reactions
Dgelemi commented, Aug 18, 2017

This is what I got - with featurizer = deepchem.feat.CoulombMatrix(26)

qm9 | random | regression | dtnn | mean-pearson_r2_score | train | 0.929191 | valid | 0.82281 | test | 0.819825 | time_for_running | 11448.62
