How to use MOSES train/test/testSF dataset in TorchDrug
TorchDrug implements the `MOSES` dataset, but doesn't distinguish between the `train` / `test` / `testSF` splits that MOSES provides. To train GCPN on MOSES, I think the correct order is to pretrain the model on the `train` split first, then train it on the `test` / `testSF` split, and finally generate the molecules. But how can I do this in TorchDrug? There's only one dataset, named `MOSES`.
I have this question because when I generate molecules after training on MOSES, the statistics don't look correct compared to other models on MOSES, especially the Scaf/Test property in the table, which checks whether the generated molecules share scaffolds with the test dataset. It is 0 for the GCPN model after training with TorchDrug, following the tutorial. I think the problem is that TorchDrug only uses the `train` dataset but not the `test` dataset. How can I explicitly use it? Thanks in advance!
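For context on why a Scaf/Test value of 0 points to no shared scaffolds at all: as far as I understand the MOSES benchmark, the Scaf metric is the cosine similarity between scaffold frequency vectors of the generated set and the reference set. The sketch below is only an illustration under that assumption. It takes scaffold SMILES strings as given (MOSES derives Bemis-Murcko scaffolds with RDKit, which is not reproduced here), and the function name `scaffold_similarity` and the toy scaffolds are hypothetical.

```python
from collections import Counter
from math import sqrt


def scaffold_similarity(gen_scaffolds, ref_scaffolds):
    """Cosine similarity between two scaffold frequency vectors.

    Both arguments are iterables of scaffold SMILES strings; extracting
    the scaffolds from molecules is assumed to happen elsewhere.
    """
    gen_counts = Counter(gen_scaffolds)
    ref_counts = Counter(ref_scaffolds)
    # Only scaffolds present in both sets contribute to the dot product.
    dot = sum(count * ref_counts[s] for s, count in gen_counts.items())
    gen_norm = sqrt(sum(c * c for c in gen_counts.values()))
    ref_norm = sqrt(sum(c * c for c in ref_counts.values()))
    if gen_norm == 0 or ref_norm == 0:
        return 0.0
    return dot / (gen_norm * ref_norm)


# Toy scaffold lists (illustrative only, not MOSES data).
reference = ["c1ccccc1", "c1ccccc1", "c1ccncc1"]
disjoint = ["C1CCCCC1", "C1CCOC1"]       # shares nothing with reference
overlapping = ["c1ccccc1", "C1CCCCC1"]   # shares one scaffold

print(scaffold_similarity(disjoint, reference))     # 0.0
print(scaffold_similarity(overlapping, reference))  # strictly between 0 and 1
```

Under this reading, a score of exactly 0 means that not a single generated scaffold occurs in the reference set, which is a stronger statement than the model merely being undertrained on that split.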
Issue Analytics
- Created 2 years ago
- Comments:5 (3 by maintainers)
Top GitHub Comments
Hi! There is a predefined split for MOSES implemented in TorchDrug. I am not sure if this is what you want. You can get it by
Sorry I am not an expert in molecule generation. Maybe @shichence knows more about the dataset and evaluation setting on MOSES?
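As a library-independent illustration of what a predefined split amounts to, here is a minimal sketch that groups molecules by their split label. It assumes the MOSES CSV has a `SMILES` column and a `SPLIT` column with values such as `train`, `test`, and `test_scaffolds`; that layout is my recollection of the MOSES release, so check it against the actual file. The helper `group_by_split` and the toy rows are hypothetical. In TorchDrug itself the predefined split is, as far as I know, obtained through a `split()` method on the dataset object, which is also an assumption worth verifying against the documentation.

```python
import csv
import io
from collections import defaultdict


def group_by_split(csv_file, smiles_field="SMILES", split_field="SPLIT"):
    """Return {split name: [SMILES, ...]} from a MOSES-style CSV file object.

    The column names are assumptions about the MOSES CSV layout.
    """
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row[split_field]].append(row[smiles_field])
    return dict(groups)


# Toy rows standing in for the real file (illustrative only).
toy_csv = io.StringIO(
    "SMILES,SPLIT\n"
    "CCO,train\n"
    "CCN,train\n"
    "c1ccccc1O,test\n"
    "c1ccncc1C,test_scaffolds\n"
)

splits = group_by_split(toy_csv)
for name, smiles in splits.items():
    print(name, len(smiles))
```

With the real file, the same function could be called on an opened CSV handle, and each resulting list could then be used to build a separate dataset for pretraining or evaluation.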
[...] `test_set`, load the checkpoint and finetune the model on `test_set`.
Pretrain:
Finetune:
The same procedure can be applied to resume training.
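The pretrain, save, load, finetune order described in this comment can be sketched as follows. This is not the maintainer's code: it is a minimal outline written against any solver object that offers `train(num_epoch=...)`, `save(path)` and `load(path)`, which is the interface TorchDrug's `core.Engine` provides as far as I know. The names `pretrain_then_finetune`, `make_solver`, `RecordingSolver` and the checkpoint filename are hypothetical; the stand-in solver only records calls so that the flow can be run without a GPU or TorchDrug installed.

```python
def pretrain_then_finetune(make_solver, train_set, test_set, checkpoint,
                           pretrain_epochs=1, finetune_epochs=1):
    """Pretrain on train_set, checkpoint, then finetune on test_set.

    make_solver(dataset) must return an object with train/save/load.
    With TorchDrug this would presumably wrap core.Engine as in the
    generation tutorial (an assumption, not shown here).
    """
    # Pretrain: train on the training split and save the weights.
    solver = make_solver(train_set)
    solver.train(num_epoch=pretrain_epochs)
    solver.save(checkpoint)

    # Finetune: build a solver on the other split, restore, keep training.
    solver = make_solver(test_set)
    solver.load(checkpoint)
    solver.train(num_epoch=finetune_epochs)
    return solver


call_log = []


class RecordingSolver:
    """Hypothetical stand-in that records calls instead of training."""

    def __init__(self, dataset):
        self.dataset = dataset

    def train(self, num_epoch=1):
        call_log.append(("train", self.dataset, num_epoch))

    def save(self, path):
        call_log.append(("save", path))

    def load(self, path):
        call_log.append(("load", path))


pretrain_then_finetune(RecordingSolver, "train_set", "test_set",
                       "moses_gcpn_pretrain.pkl")
for entry in call_log:
    print(entry)
```

Resuming training follows the same pattern: build the solver on the same dataset, load the checkpoint, and call `train` again.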