
torch_sparse.SparseTensor.size causing problems in Data and graphSAINT

See original GitHub issue

šŸ› Bug

It appears that torch_sparse.SparseTensor causes problems when calling torch_geometric.data.Data.num_nodes. I get the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-7db7380be762> in <module>
----> 1 data.num_nodes

~/local/miniconda3/envs/gnn/lib/python3.8/site-packages/torch_geometric/data/data.py in num_nodes(self)
    201             return self.__num_nodes__
    202         for key, item in self('x', 'pos', 'norm', 'batch'):
--> 203             return item.size(self.__cat_dim__(key, item))
    204         if hasattr(self, 'adj'):
    205             return self.adj.size(0)

~/local/miniconda3/envs/gnn/lib/python3.8/site-packages/torch_sparse/tensor.py in size(self, dim)
    212 
    213     def size(self, dim: int) -> int:
--> 214         return self.sizes()[dim]
    215 
    216     def dim(self) -> int:

TypeError: list indices must be integers or slices, not tuple
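The final frame of the traceback hints at the cause: `SparseTensor.sizes()` returns a plain Python list, and `Data.__cat_dim__` apparently hands back a tuple for sparse inputs (an assumption based on the traceback, not verified against the PyG source). Indexing a list with a tuple reproduces the exact error, independent of torch:

```python
# sizes: what SparseTensor.sizes() returns for a 4x4 matrix
sizes = [4, 4]
# dim: a tuple, as __cat_dim__ appears to return for sparse inputs (assumption)
dim = (0, 1)

try:
    sizes[dim]  # lists only accept integer or slice indices
except TypeError as e:
    print(e)  # -> list indices must be integers or slices, not tuple
```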

I would like to use this because my graph nodes do not have features, so I did the standard thing and put in an identity matrix. The graphs are pretty big, so I want to use sparse matrices here; otherwise, I'll run out of GPU memory pretty quickly. The same error occurs in GraphSAINT, because it tries to access data.num_nodes.

I'm pretty new to this field and to torch_geometric in general, so I was surprised this wasn't working and that this issue doesn't seem to have been reported before. Am I using this incorrectly, or is this something that just isn't supported yet?

To Reproduce

import torch
import torch_sparse
import torch_geometric as pyg

edge_index = torch.tensor([
    [1, 0, 3, 1, 2, 0],
    [0, 1, 1, 3, 0, 2],
])

num_nodes = len(edge_index.unique())

x = torch_sparse.SparseTensor.eye(num_nodes)

data = pyg.data.Data(edge_index=edge_index, x=x)

data.num_nodes  # causes error

Environment

  • OS: Ubuntu 16.04
  • Python version: 3.8.3
  • PyTorch version: 1.6.0
  • PyTorch geometric: 1.6.1
  • PyTorch sparse: 0.6.7

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
rusty1s commented, Aug 31, 2020

Hi and thanks for this issue. Using sparse node features is an interesting idea, but it currently isn't officially supported in PyG, and most GNN operators require dense feature representations.

Furthermore, I do not think that using sparse identity matrices as input features helps reduce the memory complexity of your model, since the weight matrix of the first GNN layer will have a memory complexity of O(N) nonetheless.

An alternative solution is to simply use low-dimensional random node features, e.g.,

data.x = torch.randn(data.num_nodes, num_features)

or to use a trainable embedding layer, e.g.:

data.n_id = torch.arange(data.num_nodes)
loader = GraphSAINT(...)

embedding = torch.nn.Embedding(data.num_nodes, num_features)

for data in loader:
    x = embedding(data.n_id)

0 reactions
pavlin-policar commented, Aug 31, 2020

Yeah, I don't know how not having node features would work in an inductive learning scenario.

it's just that an embedding matrix is equal to performing I @ weight

That makes a lot of sense. So technically, weā€™re adding a linear transformation to the identity matrix before passing it through any convolutions. This does increase the number of parameters I guess, but should also increase model capacity?
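The I @ weight equivalence is easy to check with a tiny pure-Python sketch (no torch; the weight values below are hypothetical): multiplying the identity matrix by a weight matrix W just returns W, so feeding a one-hot/identity input through a linear layer selects rows of W exactly as an embedding lookup does.

```python
def matmul(a, b):
    # naive dense matrix multiply over nested lists
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

n = 3
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
W = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # hypothetical embedding weights

assert matmul(identity, W) == W  # I @ W == W: lookup of rows 0..n-1
```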

Using random node features isn't so common, but it is sufficient to learn structural features.

I tried this, and it seems to be working somewhat well.

Thanks a bunch, you've been very helpful!
