Dataset construction for shared adjacency matrix and varying node features
❓ Questions & Help
First of all, thank you for the package - it is very well designed! I am trying to use it for my problem and aim to use PyTorch Geometric to implement new architectures.
The examples in the documentation all talk about creating `Data` objects with `edge_index`, `x`, etc.
In my case, the underlying adjacency matrix is the same across the dataset and I only have varying node features (and labels).
I want to use a PyG model and set up the dataset correctly so that the node features vary while the underlying graph topology is shared (crucially, without copying the graph `len(dataset)` times).
I was thinking of doing this by passing the adjacency matrix to the constructor of a `Dataset` class and using it in the `__getitem__` method. Are there any caveats to this approach, or does it violate any best practices?
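Concretely, a minimal sketch of what I have in mind (all names here are illustrative):

```python
import torch
from torch_geometric.data import Data, Dataset

class SharedTopologyDataset(Dataset):
    """Hypothetical dataset: one edge_index, shared by reference across
    all samples; only the node features (and labels) differ per item."""
    def __init__(self, edge_index, features, labels):
        super().__init__()
        self.edge_index = edge_index  # [2, num_edges], stored once
        self.features = features      # [num_samples, num_nodes, num_features]
        self.labels = labels           # [num_samples]

    def len(self):
        return self.features.size(0)

    def get(self, idx):
        # Each Data object holds a *reference* to the same edge_index
        # tensor, so the topology is not copied len(dataset) times.
        return Data(x=self.features[idx], edge_index=self.edge_index,
                    y=self.labels[idx])
```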
Top GitHub Comments
So, what I mean is:
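Something along these lines, sketched with PyG's `DenseGCNConv` (the specific layer is an assumption; the point is that the adjacency is built once and stored as a buffer):

```python
import torch
from torch_geometric.nn import DenseGCNConv
from torch_geometric.utils import to_dense_adj

class SharedGraphGCN(torch.nn.Module):
    """Hypothetical model: the topology is fixed, so the dense adjacency
    matrix is built once from edge_index and reused for every batch."""
    def __init__(self, edge_index, num_nodes, in_channels, out_channels):
        super().__init__()
        # Shape [1, num_nodes, num_nodes]; broadcasts over the batch dim.
        adj = to_dense_adj(edge_index, max_num_nodes=num_nodes)
        self.register_buffer('adj', adj)
        self.conv = DenseGCNConv(in_channels, out_channels)

    def forward(self, x):
        # x: [batch_size, num_nodes, in_channels]
        return self.conv(x, self.adj)
```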
where `x` is a `[batch_size, num_nodes, num_features]` tensor and `edge_index` holds indices `< num_nodes`. For datasets, you can then use something like this, and use the regular PyTorch `DataLoader` to create batches for `x`.
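A minimal sketch of such a dataset (the tensor shapes and names are assumptions for illustration):

```python
import torch
from torch.utils.data import Dataset, DataLoader

class NodeFeatureDataset(Dataset):
    """Hypothetical dataset: only node features and labels vary, so each
    item is a plain tensor; the shared edge_index lives in the model."""
    def __init__(self, features, labels):
        self.features = features  # [num_samples, num_nodes, num_features]
        self.labels = labels       # [num_samples]

    def __len__(self):
        return self.features.size(0)

    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]

dataset = NodeFeatureDataset(torch.randn(1000, 100, 16),
                             torch.randint(0, 2, (1000,)))
# Default collation stacks features into a [32, 100, 16] batch tensor.
loader = DataLoader(dataset, batch_size=32, shuffle=True)
```

The point of this dense setup is that `edge_index` is never replicated: batching `Data` objects with PyG's graph-level `DataLoader` would instead concatenate a copy of the topology for every sample in the batch.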
If you have a graph like so:
```python
G = nx.fast_gnp_random_graph(100, 0.1)
```
and if you use `torch_geometric.from_networkx(G)`: even though my graph already uses integer node labels, `networkx` stores its nodes in a dictionary, so their order is not guaranteed, and hence the converted graph gets distorted.
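One way to guard against this (a sketch, assuming the desired order is the sorted node labels) is to relabel the nodes to consecutive integers in sorted order before converting:

```python
import networkx as nx
from torch_geometric.utils import from_networkx

G = nx.fast_gnp_random_graph(100, 0.1, seed=0)

# Force node labels 0..n-1 in sorted order, so that node i in the
# NetworkX graph corresponds to row i of the PyG node-feature matrix.
G = nx.convert_node_labels_to_integers(G, ordering='sorted')
data = from_networkx(G)  # data.edge_index now follows the sorted order
```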