Unify molecule graph data
Currently, it seems that there are two data classes for a molecule graph, ConvMol and WeaveMol, and each is designed for one specific model (for example, ConvMol is the input for GraphConvModel).
I don't think this situation is good, and we should unify these into one data class for molecule graphs. In libraries like DGL or PyG, all models generally receive the same graph data class. That design makes the code more readable and maintainable.
I recommend that DeepChem have only one molecule graph data class and that all GNN/GCN models receive it. The current graph data is also too complicated, so I recommend making it simpler, like PyG's. The PyG design is from here.
For a molecule graph, I imagine a data class that holds the following attributes.
- data.x: Node feature matrix with shape [num_nodes, num_node_features]
- data.edge_index: Graph connectivity in COO format with shape [2, num_edges] and type torch.long
- data.edge_attr: Edge feature matrix with shape [num_edges, num_edge_features]
- data.graph_attr: Graph feature with shape [1, num_graph_features]
- data.y: Target to train against, e.g. graph-level targets of shape [1, *]
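To make the proposal concrete, here is a minimal sketch of what such a container could look like. It is only an illustration under assumptions: the class name GraphData, the validation checks, and the toy ethanol features are hypothetical, and plain Python lists stand in for the arrays or tensors a real implementation would use.

```python
from dataclasses import dataclass
from typing import List, Optional

Matrix = List[List[float]]


@dataclass
class GraphData:
    """Hypothetical unified molecule graph container (attribute names follow the list above)."""

    x: Matrix                            # [num_nodes, num_node_features]
    edge_index: List[List[int]]          # COO format: [2, num_edges]
    edge_attr: Optional[Matrix] = None   # [num_edges, num_edge_features]
    graph_attr: Optional[Matrix] = None  # [1, num_graph_features]
    y: Optional[Matrix] = None           # graph-level target, [1, *]

    def __post_init__(self) -> None:
        # edge_index holds two rows of equal length: source and destination nodes.
        if len(self.edge_index) != 2 or len(self.edge_index[0]) != len(self.edge_index[1]):
            raise ValueError("edge_index must have shape [2, num_edges]")
        # Every index must point at an existing node.
        if any(not 0 <= i < self.num_nodes for row in self.edge_index for i in row):
            raise ValueError("edge_index refers to a node that does not exist")
        # Edge features, when given, need one row per edge.
        if self.edge_attr is not None and len(self.edge_attr) != self.num_edges:
            raise ValueError("edge_attr must have one row per edge")

    @property
    def num_nodes(self) -> int:
        return len(self.x)

    @property
    def num_edges(self) -> int:
        return len(self.edge_index[0])


# Ethanol heavy atoms C-C-O; each bond is stored once per direction.
# Toy features: node = one-hot over (C, O), edge = bond order.
ethanol = GraphData(
    x=[[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
    edge_index=[[0, 1, 1, 2], [1, 0, 2, 1]],
    edge_attr=[[1.0], [1.0], [1.0], [1.0]],
)
print(ethanol.num_nodes, ethanol.num_edges)
```

Because every model would consume the same fields, a featurizer only has to produce one object type, which is the readability and maintainability benefit argued for above.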
Issue Analytics
- State:
- Created 3 years ago
- Comments:6 (6 by maintainers)
Top GitHub Comments
I’ve also wondered whether we really need different featurizers for them. Having a unified representation for all graph models would be very nice.
I quite like this suggestion! Ideally our Graph class should be easily interoperable with PyG’s class (perhaps via a Graph.to_pytorch_geometric_graph() method).
This will take a little bit of work to do right though. It would be great to field test this idea in Jaxchem and then upstream it into DeepChem.
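As a rough sketch of the interoperability idea, the conversion method could import PyG lazily so that it stays an optional dependency. The Graph class below is a hypothetical stand-in, not an existing DeepChem class; the only external calls used are torch.tensor and torch_geometric.data.Data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Graph:
    """Hypothetical stand-in for a unified DeepChem graph class."""

    x: List[List[float]]                           # [num_nodes, num_node_features]
    edge_index: List[List[int]]                    # [2, num_edges], COO format
    edge_attr: Optional[List[List[float]]] = None  # [num_edges, num_edge_features]

    def to_pytorch_geometric_graph(self):
        """Convert to a torch_geometric.data.Data object (optional dependency)."""
        try:
            import torch
            from torch_geometric.data import Data
        except ImportError as err:
            raise ImportError(
                "to_pytorch_geometric_graph requires torch and torch_geometric"
            ) from err

        return Data(
            x=torch.tensor(self.x, dtype=torch.float),
            edge_index=torch.tensor(self.edge_index, dtype=torch.long),
            edge_attr=None
            if self.edge_attr is None
            else torch.tensor(self.edge_attr, dtype=torch.float),
        )


# Two nodes joined by one bond, stored once per direction.
g = Graph(x=[[1.0], [2.0]], edge_index=[[0, 1], [1, 0]])
```

Calling g.to_pytorch_geometric_graph() then yields an object PyG models can consume directly, provided both packages are installed; otherwise it raises ImportError.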