IndexError in MetaPath2Vec
🐛 Bug
Hi,
I’m getting an IndexError when training MetaPath2Vec on my own dataset. The stack trace is
IndexError: Caught IndexError in DataLoader worker process 4.
Original Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/GNN2/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/ubuntu/anaconda3/envs/GNN2/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/home/ubuntu/anaconda3/envs/GNN2/lib/python3.7/site-packages/torch_geometric/nn/models/metapath2vec.py", line 157, in sample
    return self.pos_sample(batch), self.neg_sample(batch)
  File "/home/ubuntu/anaconda3/envs/GNN2/lib/python3.7/site-packages/torch_geometric/nn/models/metapath2vec.py", line 123, in pos_sample
    batch = adj.sample(num_neighbors=1, subset=batch).squeeze()
  File "/home/ubuntu/anaconda3/envs/GNN2/lib/python3.7/site-packages/torch_sparse/sample.py", line 22, in sample
    return col[rand]
IndexError: index 1549811 is out of bounds for dimension 0 with size 1549811
From what I understand, it looks like the final entry in the rowptr tensor in sample is being referenced, which is an index out of bounds for the col tensor (as it is equal to the length of the col tensor). However, it looks like this doesn’t happen on the default AMiner dataset, despite the fact that the subset tensor is a subset of a larger tensor in which the maximum value would index the final value in rowptr. Therefore I think I’m misunderstanding part of the code, so any help would be very much appreciated.
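To make the suspected mechanism concrete, here is a minimal pure-Python sketch of one-neighbor sampling over a CSR adjacency. It is not the torch_sparse implementation; the function name sample_one_neighbor and the toy rowptr/col values are made up for illustration, and the offset formula (row start plus a random value scaled by the row's degree) is an assumption about how such a sampler typically works. Under that assumption, a node with zero outgoing edges gets an offset equal to its row start, and for a trailing node that row start equals the length of col.

```python
import random


def sample_one_neighbor(rowptr, col, subset, rng=random):
    """Pick one neighbor per node in `subset` from a CSR adjacency.

    Hypothetical helper for illustration; not the torch_sparse API.
    """
    out = []
    for node in subset:
        start = rowptr[node]
        degree = rowptr[node + 1] - start
        # Assumed formula: offset = row start + floor(rand * degree).
        # With degree == 0 the offset is just `start`, which is not a
        # valid neighbor slot for this node.
        offset = start + int(rng.random() * degree)
        out.append(col[offset])
    return out


# Toy graph with 4 nodes. Node 3 (the last one) has no outgoing edges,
# so rowptr[3] == rowptr[4] == len(col).
rowptr = [0, 2, 3, 5, 5]
col = [1, 2, 0, 0, 1]

# Nodes that all have at least one neighbor sample without error.
print(sample_one_neighbor(rowptr, col, [0, 1, 2]))

# The trailing isolated node indexes one past the end of `col`.
try:
    sample_one_neighbor(rowptr, col, [3])
except IndexError as err:
    print("IndexError:", err)
```

If this reading is right, the error depends on the data and not on the size of subset: it only appears when a sampled node has no outgoing edge and sits at the end of the adjacency. An isolated node in the middle would not raise, but would silently return a neighbor that belongs to the next row.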
Reproducing the behaviour is complicated because I can’t get the error to occur on the AMiner dataset, and I’m unable to share the dataset I’m working with. If it would be helpful for me to report back any metrics, or the results of any functions on my dataset, please let me know and I’ll do what I can.
Thank you very much for your time, and for putting together such a fantastic library!
Environment
- OS: Ubuntu 18.04.5
- Python version: 3.7.10
- PyTorch version: 1.7.1+cu101
- CUDA/cuDNN version: 10.1, V10.1.243
- GCC version: 7.5.0
Issue Analytics
- Created 3 years ago
- Reactions:1
- Comments:10 (6 by maintainers)
Top GitHub Comments
Sadly not yet, and it does not really resolve this issue, as there might be nodes that are only isolated for a few edge types, while they are connected to some nodes for other edge types. I’m trying to fix this directly in MetaPath2Vec.
Hi @rusty1s, a similar problem appeared when I tested with my dataset. I built a toy project to help you reproduce and understand the problem: https://github.com/Amayama/pyg_error_toy Thanks for your help!
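For anyone who cannot share their dataset, the maintainer's point about nodes that are isolated for only some edge types can be checked locally. The sketch below is one possible diagnostic in plain Python; the helper isolated_sources, the toy edges_by_type dictionary, and the node counts are all hypothetical. With real PyG data, the (source, destination) pairs would presumably come from the columns of each edge_index tensor in the edge_index_dict passed to MetaPath2Vec.

```python
def isolated_sources(edges, num_src_nodes):
    """Return the source-node ids that have no outgoing edge in `edges`.

    Hypothetical helper: `edges` is a list of (src, dst) index pairs for
    a single edge type.
    """
    has_edge = [False] * num_src_nodes
    for src, _dst in edges:
        has_edge[src] = True
    return [node for node, ok in enumerate(has_edge) if not ok]


# Toy heterogeneous graph (made-up data): author 2 has written nothing,
# so it is isolated for "writes" even though it is a valid author node.
edges_by_type = {
    ("author", "writes", "paper"): [(0, 0), (1, 0), (1, 1)],
    ("paper", "written_by", "author"): [(0, 0), (0, 1), (1, 1)],
}
num_nodes = {"author": 3, "paper": 2}

for (src_type, rel, dst_type), edges in edges_by_type.items():
    missing = isolated_sources(edges, num_nodes[src_type])
    print(src_type, rel, dst_type, "isolated sources:", missing)
```

A non-empty list for any edge type on the metapath means a random walk can land on a node with nowhere to go for that step, which is the situation described in the first comment above.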