Graph convolution in mini-batch (manually defined graphs)


❓ Questions & Help

I’d like to perform graph convolution on manually defined graphs in mini-batches. I’ve read the documentation (e.g., here and here), but I haven’t figured out how to do it.

A quick example: Let’s say that we need to define a batch of graphs, of the same structure, i.e., with the same edge_index, but with different feature signals and different edge attributes.

For instance, let’s define a simple directed graph structure with the following edge_index:

import torch
from torch_geometric.data import Data as gData
import torch_geometric.nn as gnn
import numpy as np

num_nodes = 7
num_node_features = 16

edge_index = torch.tensor(np.concatenate([np.arange(num_nodes), np.roll(np.arange(num_nodes), shift=1)]).reshape(-1, num_nodes),  dtype=torch.long)
edge_index
tensor([[0, 1, 2, 3, 4, 5, 6],
        [6, 0, 1, 2, 3, 4, 5]])
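As a side note, the same ring-shaped edge_index can be built without the NumPy round trip. This is a minimal sketch that assumes only torch; the helper name `ring_edge_index` is introduced here for illustration and does not appear in the issue.

```python
import torch

def ring_edge_index(num_nodes: int) -> torch.Tensor:
    # Row 0 holds the node indices 0..n-1; row 1 holds the same range
    # rolled right by one position, so edge k connects k to (k - 1) mod n.
    nodes = torch.arange(num_nodes, dtype=torch.long)
    return torch.stack([nodes, torch.roll(nodes, shifts=1)], dim=0)

edge_index = ring_edge_index(7)
```

The result has the same values and dtype as the tensor printed above.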

Now, let’s define a simple graph convolution operator, e.g., GCNConv, that will act on such graphs:

gconv = gnn.GCNConv(in_channels=num_node_features, out_channels=32)

Then, if I define a graph signal as below:

x = torch.randn((num_nodes, num_node_features), dtype=torch.float)
print(x.size())
torch.Size([7, 16])

and pass it through gconv, I have:

y = gconv(x, edge_index)
print(y.size())
torch.Size([7, 32])

which is fine.

Now, I’d like to do the same in a mini-batch manner; i.e., to define a batch of such signals that, along with the same edge_index, will be passed through gconv.

It seems that this could be somehow done using batch, mentioned here, but I cannot find any reference on how this could be done.

The “problem” is that I need to define my graphs dynamically during training, though they will all have the same topology (in the sense of edge_index). What will change and be updated are the features and edge attributes: each graph in the batch will have a signal x with shape [num_nodes, num_node_features] and edge attributes with shape [num_edges, num_edge_features] (in the ring graph above, num_edges happens to equal num_nodes).

Thanks for your time.

Issue Analytics

  • State: open
  • Created 4 years ago
  • Reactions: 3
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

5 reactions
rusty1s commented, Feb 18, 2020

You have two options here: (1) replicate your edge_index by stacking it diagonally, e.g., via:

from torch_geometric.data import Batch, Data
batch_edge_index = Batch.from_data_list([Data(edge_index=edge_index)] * batch_size).edge_index

or (2) use the node_dim property of message passing operators:

conv = GCNConv(in_channels, out_channels, node_dim=1)
conv(x, edge_index)  # here, x is a tensor of size [batch_size, num_nodes, num_features]

I will try to explain this here in more detail.
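To make the relationship between the two options concrete, here is a minimal sketch in plain PyTorch. It is not GCNConv: `aggregate_sum` is a hypothetical helper that performs only an unnormalized sum of neighbor features (no weights, no self-loops, no normalization). It assumes the ring graph from the question and shows that aggregating over a diagonally stacked edge_index and aggregating along a node dimension of 1 give the same result.

```python
import torch

def aggregate_sum(x, edge_index, node_dim=0):
    # Gather source-node features and sum them into the target nodes,
    # indexing nodes along `node_dim`.
    src, dst = edge_index
    messages = x.index_select(node_dim, src)
    out = torch.zeros_like(x)
    out.index_add_(node_dim, dst, messages)
    return out

batch_size, num_nodes, num_features = 4, 7, 16
nodes = torch.arange(num_nodes)
edge_index = torch.stack([nodes, torch.roll(nodes, shifts=1)], dim=0)
x = torch.randn(batch_size, num_nodes, num_features)

# Option 1: flatten the batch into one big graph whose edge_index is the
# original one repeated with node offsets 0, N, 2N, ...
offsets = torch.arange(batch_size) * num_nodes
stacked_edge_index = torch.cat([edge_index + off for off in offsets], dim=1)
out_flat = aggregate_sum(x.reshape(-1, num_features), stacked_edge_index, node_dim=0)

# Option 2: keep the batch dimension and index nodes along dim 1,
# reusing the single edge_index for every graph in the batch.
out_batched = aggregate_sum(x, edge_index, node_dim=1)
```

Reshaping `out_flat` to [batch_size, num_nodes, num_features] should match `out_batched`. Note that option 2 only applies when all graphs share one edge_index and there are no per-graph edge attributes to pair with the edges.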

1 reaction
chi0tzp commented, Feb 19, 2020

@rusty1s thank you for the clarification. Below is a simple example of using NNConv in batch mode, where edge_index is fixed (all the graphs have the same topology; specifically, they are complete graphs without self-loops) and both node and edge features are passed through the graph convolution layer (again, in batch mode).

It’d be nice if you could confirm that’s how it should be done, but of course I’m not asking that.

import torch
import torch.nn as tnn
from torch_geometric.data import Data as gData
from torch_geometric.data import Batch
import torch_geometric.nn as gnn
import numpy as np

# Build edge_index for a complete graph with given number of nodes (no self-loops)
def build_edge_idx(num_nodes):
    # Initialize edge index matrix
    E = torch.zeros((2, num_nodes * (num_nodes - 1)), dtype=torch.long)

    # Populate 1st row
    for node in range(num_nodes):
        for neighbor in range(num_nodes - 1):
            E[0, node * (num_nodes - 1) + neighbor] = node

    # Populate 2nd row
    neighbors = []
    for node in range(num_nodes):
        neighbors.append(list(np.arange(node)) + list(np.arange(node + 1, num_nodes)))
    E[1, :] = torch.tensor([item for sublist in neighbors for item in sublist], dtype=torch.long)

    return E

# Set number of nodes
num_nodes = 7

# Build edge_index -- shape: [torch.Size([2, 42])]
edge_index = build_edge_idx(num_nodes=num_nodes)

# Define input batch (node and edge features)
batch_size = 4

# Node features -- batch_x has shape: torch.Size([4, 7, 16])
num_in_node_features = 16
batch_x = torch.randn((batch_size, num_nodes, num_in_node_features), dtype=torch.float)

# Edge features -- batch_edge_attr has shape: torch.Size([4, 42, 8])
num_in_edge_features = 8
batch_edge_attr = torch.randn((batch_size, edge_index.size(1), num_in_edge_features), dtype=torch.float)

# Wrap input node and edge features, along with the single edge_index, into a `torch_geometric.data.Batch` instance
l = []
for i in range(batch_size):
    l.append(gData(x=batch_x[i], edge_index=edge_index, edge_attr=batch_edge_attr[i]))
batch = Batch.from_data_list(l)

# Thus, 
# batch.x          -- shape: torch.Size([28, 16])
# batch.edge_index -- shape: torch.Size([2, 168])
# batch.edge_attr  -- shape: torch.Size([168, 8])


# Define NNConv layer
num_out_node_features = 64
nn = tnn.Sequential(tnn.Linear(num_in_edge_features, 25), tnn.ReLU(), tnn.Linear(25, num_in_node_features * num_out_node_features))
gconv = gnn.NNConv(in_channels=num_in_node_features, out_channels=num_out_node_features, nn=nn, aggr='mean')

# Forward pass
y = gconv(x=batch.x, edge_index=batch.edge_index, edge_attr=batch.edge_attr)
# y -- shape: torch.Size([28, 64])
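The nested loops in build_edge_idx can probably be replaced by a vectorized construction. The following is a sketch under that assumption; `complete_graph_edge_index` is a name introduced here, and it is meant to produce the same edge ordering as the function above (for each node, its neighbors in ascending order).

```python
import torch

def complete_graph_edge_index(num_nodes: int) -> torch.Tensor:
    # Enumerate all ordered pairs (row, col) in row-major order,
    # then drop the self-loops where row == col.
    row = torch.arange(num_nodes).repeat_interleave(num_nodes)
    col = torch.arange(num_nodes).repeat(num_nodes)
    mask = row != col
    return torch.stack([row[mask], col[mask]], dim=0)

edge_index = complete_graph_edge_index(7)
```

For 7 nodes this yields a long tensor of shape [2, 42], matching the shape noted in the example.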

A final question would be how the output of the convolution (i.e., y) could be wrapped into a batch again (in this case apparently without edge_attr, but only the output features and edge_index, which remains the same).
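One possible approach, offered as a sketch rather than a confirmed answer from the maintainers: Batch.from_data_list concatenates node features graph by graph, and every graph here has the same number of nodes, so a reshape should recover the batch axis. The tensor `y` below is a random stand-in with the same shape as the NNConv output above, and the sizes are copied from the example.

```python
import torch

batch_size, num_nodes, num_out_node_features = 4, 7, 64

# Stand-in for the NNConv output: one row per node, graphs stacked in order.
y = torch.randn(batch_size * num_nodes, num_out_node_features)

# All graphs have the same number of nodes, so a view recovers the batch axis.
batch_y = y.view(batch_size, num_nodes, num_out_node_features)

# The rows of graph i occupy y[i * num_nodes : (i + 1) * num_nodes].
i = 2
graph_i = y[i * num_nodes:(i + 1) * num_nodes]
```

If the goal is only to stack further convolution layers, the flat `y` can be passed to the next layer together with the unchanged batch.edge_index, with no re-wrapping needed.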

Many thanks again.
