
Preparing data for HypergraphConv based on "morris/graphkerneldatasets"


❓ Questions & Help

I would like to gain some more understanding of the HypergraphConv layer implemented in PyTorch Geometric. Let's say we have a dataset from https://ls11-www.cs.tu-dortmund.de/staff/morris/graphkerneldatasets and, further, a set of hyperedges (each hyperedge being a set of nodes). For a standard graph dataset we have DatasetName_A.txt, which holds the sparse adjacency matrix, node features in DatasetName_node_attributes.txt (one attribute vector per node), and so on.

Now, given this information, how do we prepare the data for HypergraphConv? As far as I understand, we need the hypergraph incidence matrix H (N × M), where M is the number of hyperedges. We would replace the adjacency matrix with this incidence matrix, which essentially records whether a node belongs to a given hyperedge. What happens then to the original graph and the original files that come with it? (Thank you for reading this. Once I understand it fully, I would like to prepare a README and contribute an example to PyTorch Geometric.)
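For concreteness, here is a minimal sketch (not from the original issue) of one way to build the sparse hyperedge index that HypergraphConv consumes from a list of hyperedges. The toy hyperedges, feature size, and variable names are illustrative assumptions, not values taken from the TU dataset files.

import torch
from torch_geometric.nn import HypergraphConv

num_nodes = 4
hyperedges = [[0, 1, 2], [1, 3]]   # M = 2 hyperedges, each a set of node indices

# COO form of the N x M incidence matrix H:
# row 0 = node index, row 1 = hyperedge index of each nonzero entry.
node_idx = torch.tensor([n for e in hyperedges for n in e])
edge_idx = torch.tensor([j for j, e in enumerate(hyperedges) for _ in e])
hyperedge_index = torch.stack([node_idx, edge_idx], dim=0)

x = torch.randn(num_nodes, 16)     # node features, e.g. read from DatasetName_node_attributes.txt
conv = HypergraphConv(16, 32)
out = conv(x, hyperedge_index)     # -> [num_nodes, 32]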

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 8 (4 by maintainers)

Top GitHub Comments

2 reactions
rusty1s commented, May 13, 2020

I see. You need some advanced mini-batching techniques for this, see here. Basically, you want to increase edge_index[0] by data.num_nodes and increase edge_index[1] by the number of hyperedges. The best way to achieve this is to generate a new data class:

import torch
from torch_geometric.data import Data

class HyperEdgeData(Data):
    def __inc__(self, key, value):
        if key == 'edge_index':
            # When batching, shift node indices (row 0) by this graph's number
            # of nodes and hyperedge indices (row 1) by its number of hyperedges.
            return torch.tensor([[self.num_nodes], [value[1].max().item() + 1]])
        else:
            return super(HyperEdgeData, self).__inc__(key, value)

and use this instead of torch_geometric.data.Data.
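For context, here is a hedged usage sketch (not from the thread) of how the HyperEdgeData class above behaves under batching. It assumes the PyG 1.x-era API that matches the __inc__(self, key, value) signature shown; in PyG >= 2.0 the DataLoader lives in torch_geometric.loader and __inc__ takes extra arguments.

import torch
from torch_geometric.data import DataLoader  # torch_geometric.loader.DataLoader in PyG >= 2.0

# Two toy hypergraphs: 3 nodes / 2 hyperedges and 2 nodes / 1 hyperedge.
d1 = HyperEdgeData(x=torch.randn(3, 16),
                   edge_index=torch.tensor([[0, 1, 1, 2], [0, 0, 1, 1]]))
d2 = HyperEdgeData(x=torch.randn(2, 16),
                   edge_index=torch.tensor([[0, 1], [0, 0]]))

batch = next(iter(DataLoader([d1, d2], batch_size=2)))
print(batch.edge_index)
# Row 0 of d2's edge_index is shifted by d1.num_nodes (3),
# row 1 by d1's number of hyperedges (2):
# tensor([[0, 1, 1, 2, 3, 4],
#         [0, 0, 1, 1, 2, 2]])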

1 reaction
SpaceLearner commented, Nov 26, 2020

Hi, I found that when using HypergraphConv, if the number of hyperedges is larger than the number of nodes, the program raises a RuntimeError (index out of bounds). This seems to be a bug.

Traceback of TorchScript (most recent call last):
  File "/home/yayaming/miniconda3/envs/sessRec/lib/python3.8/site-packages/torch_scatter/scatter.py", line 22, in scatter_sum
            size[dim] = int(index.max()) + 1
        out = torch.zeros(size, dtype=src.dtype, device=src.device)
        return out.scatter_add_(dim, index, src)
               ~~~~~~~~~~~~~~~~ <--- HERE
    else:
        return out.scatter_add_(dim, index, src)
RuntimeError: index 540 is out of bounds for dimension 0 with size 540
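For what it's worth, below is a minimal sketch (not from this comment) of the setup described, i.e. more hyperedges than nodes. On the PyG version used in the comment this setup reportedly triggered the out-of-bounds RuntimeError; newer releases may behave differently. The sizes and feature dimension are arbitrary assumptions.

import torch
from torch_geometric.nn import HypergraphConv

num_nodes, num_hyperedges = 4, 6                        # more hyperedges than nodes
node_idx = torch.arange(num_nodes).repeat(2)            # 8 node-hyperedge incidences
edge_idx = torch.arange(num_hyperedges).repeat(2)[:8]   # hyperedge ids 0..5
hyperedge_index = torch.stack([node_idx, edge_idx], dim=0)

x = torch.randn(num_nodes, 16)
out = HypergraphConv(16, 32)(x, hyperedge_index)        # reportedly raised the RuntimeError above on older PyG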

