torch_sparse.SparseTensor.size causing problems in Data and graphSAINT
See original GitHub issueš Bug
It appears that torch_sparse.SparseTensor
causes problems when calling torch_geometric.data.Data.num_nodes
. I get the following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-6-7db7380be762> in <module>
----> 1 data.num_nodes
~/local/miniconda3/envs/gnn/lib/python3.8/site-packages/torch_geometric/data/data.py in num_nodes(self)
201 return self.__num_nodes__
202 for key, item in self('x', 'pos', 'norm', 'batch'):
--> 203 return item.size(self.__cat_dim__(key, item))
204 if hasattr(self, 'adj'):
205 return self.adj.size(0)
~/local/miniconda3/envs/gnn/lib/python3.8/site-packages/torch_sparse/tensor.py in size(self, dim)
212
213 def size(self, dim: int) -> int:
--> 214 return self.sizes()[dim]
215
216 def dim(self) -> int:
TypeError: list indices must be integers or slices, not tuple
I would like to use this because my graph nodes do not have features, so I did the standard thing and put in an identity matrix. The graphs are pretty big, so I want to use sparse matrices here, otherwise, Iāll run out of GPU memory pretty quick. The same error occurs in graphSAINT, because it tries to access data.num_nodes
.
Iām pretty new to this field and torch_geometric in general, so I was surprised this wasnāt working and that this issue doesnāt seem to have been reported before. Am I using this incorrectly, or is this something that just isnāt supported yet?
To Reproduce
import torch
import torch_sparse
import torch_geometric as pyg
edge_index = torch.tensor([
[1, 0, 3, 1, 2, 0],
[0, 1, 1, 3, 0, 2],
])
num_nodes = len(edge_index.unique())
x = torch_sparse.SparseTensor.eye(num_nodes)
data = pyg.data.Data(edge_index=edge_index, x=x)
data.num_nodes # causes error
Environment
- OS: Ubuntu 16.04
- Python version: 3.8.3
- PyTorch version: 1.6.0
- PyTorch geometric: 1.6.1
- PyTorch sparse: 0.6.7
Issue Analytics
- State:
- Created 3 years ago
- Comments:5 (2 by maintainers)
Top Results From Across the Web
torch_sparse.SparseTensor.size causing problems in Data ...
SparseTensor , in which case getting num_nodes works, but in that case graphSAINT failed because torch.sparse.FloatTensor doesn't support properĀ ...
Read more >Source code for torch_geometric.data.graph_saint
For an example of using GraphSAINT sampling, see `examples/graph_saint.py ... E = data.num_edges self.adj = SparseTensor( row=data.edge_index[0],Ā ...
Read more >torch.sparse ā PyTorch 1.13 documentation
A sparse COO tensor can be constructed by providing the two tensors of indices and values, as well as the size of the...
Read more >EXACT: SCALABLE GRAPH NEURAL NETWORKS
Hence, storing these node embeddings is the major memory bottleneck for training GNNs on large graphs. Most of the existing works towards this...
Read more >Column/row slicing a torch sparse tensor - Stack Overflow
Hopefully this feature will be properly covered soon c.f. https://github.com/pytorch/pytorch/issues/3025 Snippet by Aron Barreira Bordin Args: xĀ ...
Read more >
Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free
Top Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Hi and thanks for this issue. Using sparse node features is an interesting idea but it currently isnāt officially supported in PyG, and most GNN operators require to operate on dense feature representations.
Furthermore, I do not think that using sparse identity matrices as input features does help in reducing memory complexity of your model since your weight matrices in the first GNN will have a memory complexity of O(N) nonetheless.
An alternative solution is to simply use random node features of low-dimensionality, e.g.,
or to use a trainable embedding layer, e.g.:
Yeah, I donāt know how not having node-features would work in an inductive learning scenario.
That makes a lot of sense. So technically, weāre adding a linear transformation to the identity matrix before passing it through any convolutions. This does increase the number of parameters I guess, but should also increase model capacity?
I tried this, and it seems to be working somewhat well.
Thanks a bunch, youāve been very helpful!