Unify molecule graph data

Currently, it seems there are two data classes for a molecule graph, ConvMol and WeaveMol, and each is designed for one specific model (for example, ConvMol for GraphConvModel).

I think this situation is not good, and we should unify them into one data class for molecule graphs. Generally, all models receive the same graph data class, as in DGL or PyG. This design makes the code more readable and maintainable.

I recommend that DeepChem have only one molecule graph data class and that all GNN/GCN models receive it. The current graph data is also too complicated, so I recommend making it simpler, like PyG's. The PyG design is from here.

For a molecule graph, I imagine a data class that holds the following attributes.

  • data.x: Node feature matrix with shape [num_nodes, num_node_features]
  • data.edge_index: Graph connectivity in COO format with shape [2, num_edges] and type torch.long
  • data.edge_attr: Edge feature matrix with shape [num_edges, num_edge_features]
  • data.graph_attr: Graph feature with shape [1, num_graph_features]
  • data.y: Target to train against, as graph-level targets of shape [1, *]
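
The attribute list above could be sketched as a small container like the following. This is only an illustration of the proposal, not an existing DeepChem class: the name `MolGraphData`, the validation rules, and the use of nested Python lists are all assumptions made to keep the example dependency-free. A real implementation would more likely hold NumPy arrays or tensors.

```python
from dataclasses import dataclass
from typing import List, Optional

Matrix = List[List[float]]


@dataclass
class MolGraphData:
    """Hypothetical unified molecule graph container (not a DeepChem class)."""

    x: Matrix                             # [num_nodes, num_node_features]
    edge_index: List[List[int]]           # [2, num_edges], COO format
    edge_attr: Optional[Matrix] = None    # [num_edges, num_edge_features]
    graph_attr: Optional[Matrix] = None   # [1, num_graph_features]
    y: Optional[Matrix] = None            # [1, *]

    def __post_init__(self) -> None:
        # edge_index needs a source row and a target row of equal length.
        if len(self.edge_index) != 2 or len(self.edge_index[0]) != len(self.edge_index[1]):
            raise ValueError("edge_index must have shape [2, num_edges]")
        # Every endpoint must point at a row of x.
        if any(not 0 <= i < self.num_nodes for row in self.edge_index for i in row):
            raise ValueError("edge_index refers to a node that does not exist")
        if self.edge_attr is not None and len(self.edge_attr) != self.num_edges:
            raise ValueError("edge_attr must have one row per edge")
        # Graph-level fields carry a leading dimension of 1.
        for name in ("graph_attr", "y"):
            value = getattr(self, name)
            if value is not None and len(value) != 1:
                raise ValueError(f"{name} must have a leading dimension of 1")

    @property
    def num_nodes(self) -> int:
        return len(self.x)

    @property
    def num_edges(self) -> int:
        return len(self.edge_index[0])


# Toy example: a three-atom chain 0-1-2, each bond stored in both directions.
graph = MolGraphData(
    x=[[6.0, 0.0], [8.0, 0.0], [6.0, 1.0]],
    edge_index=[[0, 1, 1, 2], [1, 0, 2, 1]],
    edge_attr=[[1.0], [1.0], [2.0], [2.0]],
    y=[[0.5]],
)
print(graph.num_nodes, graph.num_edges)
```

Storing each undirected bond as two directed edges follows the COO convention described for `edge_index`; the feature values in the toy example are made up.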

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
peastman commented, Jul 1, 2020

I’ve also wondered whether we really need different featurizers for them. Having a unified representation for all graph models would be very nice.

1 reaction
rbharath commented, Jul 1, 2020

I quite like this suggestion! Ideally our Graph class should be easily interoperable with PyG’s class (perhaps a Graph.to_pytorch_geometric_graph() method)
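
One way such a conversion could look is sketched below. The function names and the split into two helpers are assumptions for illustration; only `torch.tensor` and `torch_geometric.data.Data` (which accepts `x`, `edge_index`, `edge_attr`, `y` and extra keyword attributes) are real library calls, and they are imported lazily so the first helper works without PyTorch installed.

```python
from typing import Any, Dict


def to_pyg_kwargs(x, edge_index, edge_attr=None, graph_attr=None, y=None) -> Dict[str, Any]:
    """Collect the non-empty fields under the keyword names PyG's Data accepts."""
    fields = {
        "x": x,
        "edge_index": edge_index,
        "edge_attr": edge_attr,
        "graph_attr": graph_attr,  # not a standard PyG field; passed as an extra attribute
        "y": y,
    }
    return {name: value for name, value in fields.items() if value is not None}


def to_pytorch_geometric_graph(**fields):
    """Build a torch_geometric.data.Data object (hypothetical helper).

    Requires torch and torch_geometric to be installed.
    """
    import torch
    from torch_geometric.data import Data

    # edge_index holds integer node ids; everything else is treated as float features.
    tensors = {
        name: torch.tensor(value, dtype=torch.long if name == "edge_index" else torch.float)
        for name, value in to_pyg_kwargs(**fields).items()
    }
    return Data(**tensors)
```

Keeping the field names identical to PyG's means the conversion is little more than a dtype change, which is the main appeal of adopting the same layout.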

This will take a little bit of work to do right though. It would be great to field test this idea in Jaxchem and then upstream into DeepChem.
