Dataset construction for shared adjacency matrix and varying node features
❓ Questions & Help
First of all, thank you for the package - it is very well designed! I am trying to use it for my problem and aim to use PyTorch Geometric to implement new architectures.
The examples in the documentation all talk about creating `Data` objects with `edge_index`, `x`, etc.
In my case, the underlying adjacency matrix is the same across the dataset and I only have varying node features (and labels).
I want to use a PyG model and set up the dataset correctly so that the node features vary while the underlying graph topology is shared (crucially, without copying the graph `len(dataset)` times).
I was thinking of doing this by passing the adjacency matrix to the constructor of a `Dataset` class and using it in the `__getitem__` method. Are there any caveats to this approach, or does it violate any best practices?
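Concretely, a minimal sketch of what I have in mind (all names here are illustrative):

```python
import torch
from torch_geometric.data import Data, Dataset

class SharedTopologyDataset(Dataset):
    """Hypothetical dataset: one edge_index, shared by reference across
    all samples; only the node features (and labels) differ per item."""
    def __init__(self, edge_index, features, labels):
        super().__init__()
        self.edge_index = edge_index  # [2, num_edges], stored once
        self.features = features      # [num_samples, num_nodes, num_features]
        self.labels = labels           # [num_samples]

    def len(self):
        return self.features.size(0)

    def get(self, idx):
        # Each Data object holds a *reference* to the same edge_index
        # tensor, so the topology is not copied len(dataset) times.
        return Data(x=self.features[idx], edge_index=self.edge_index,
                    y=self.labels[idx])
```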
Top GitHub Comments
So, what I mean is:
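Something along these lines, sketched with PyG's `DenseGCNConv` (the specific layer is an assumption; the point is that the adjacency is built once and stored as a buffer):

```python
import torch
from torch_geometric.nn import DenseGCNConv
from torch_geometric.utils import to_dense_adj

class SharedGraphGCN(torch.nn.Module):
    """Hypothetical model: the topology is fixed, so the dense adjacency
    matrix is built once from edge_index and reused for every batch."""
    def __init__(self, edge_index, num_nodes, in_channels, out_channels):
        super().__init__()
        # Shape [1, num_nodes, num_nodes]; broadcasts over the batch dim.
        adj = to_dense_adj(edge_index, max_num_nodes=num_nodes)
        self.register_buffer('adj', adj)
        self.conv = DenseGCNConv(in_channels, out_channels)

    def forward(self, x):
        # x: [batch_size, num_nodes, in_channels]
        return self.conv(x, self.adj)
```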
where `x` is a `[batch_size, num_nodes, num_features]` tensor and `edge_index` holds indices `< num_nodes`. For datasets, you can then use something like this, and use the regular PyTorch `DataLoader` to create batches for `x`.
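A minimal sketch of such a dataset (the tensor shapes and names are assumptions for illustration):

```python
import torch
from torch.utils.data import Dataset, DataLoader

class NodeFeatureDataset(Dataset):
    """Hypothetical dataset: only node features and labels vary, so each
    item is a plain tensor; the shared edge_index lives in the model."""
    def __init__(self, features, labels):
        self.features = features  # [num_samples, num_nodes, num_features]
        self.labels = labels       # [num_samples]

    def __len__(self):
        return self.features.size(0)

    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]

dataset = NodeFeatureDataset(torch.randn(1000, 100, 16),
                             torch.randint(0, 2, (1000,)))
# Default collation stacks features into a [32, 100, 16] batch tensor.
loader = DataLoader(dataset, batch_size=32, shuffle=True)
```

The point of this dense setup is that `edge_index` is never replicated: batching `Data` objects with PyG's graph-level `DataLoader` would instead concatenate a copy of the topology for every sample in the batch.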
If you have a graph like so:
```python
G = nx.fast_gnp_random_graph(100, 0.1)
```
and if you use `torch_geometric.from_networkx(G)`: even though my graph already uses integer node labels, `networkx` stores its nodes in a dictionary, so their order is not guaranteed, and hence the converted graph gets distorted.
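One way to guard against this (a sketch, assuming the desired order is the sorted node labels) is to relabel the nodes to consecutive integers in sorted order before converting:

```python
import networkx as nx
from torch_geometric.utils import from_networkx

G = nx.fast_gnp_random_graph(100, 0.1, seed=0)

# Force node labels 0..n-1 in sorted order, so that node i in the
# NetworkX graph corresponds to row i of the PyG node-feature matrix.
G = nx.convert_node_labels_to_integers(G, ordering='sorted')
data = from_networkx(G)  # data.edge_index now follows the sorted order
```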