Raising "list index out of range" error in torch_geometric/data/collate.py

🐛 Bug

When I used torch_geometric.data.Data as the data structure to build an InMemoryDataset, the following error was raised in the collate function.

File ".../torch_geometric/data/collate.py", line 184, in _collate
    and isinstance(elem[0], (Tensor, SparseTensor))):
IndexError: list index out of range

This error also occurred in

File ".../torch_geometric/data/separate.py", line 92, in _separate
    and isinstance(value[0][0], (Tensor, SparseTensor))):
IndexError: list index out of range

To Reproduce

import torch
from torch_geometric.data import InMemoryDataset, download_url
from torch_geometric.data import Data

class MyOwnDataset(InMemoryDataset):
    def __init__(self, root, transform=None, pre_transform=None):
        super().__init__(root, transform, pre_transform)
        self.data, self.slices = torch.load(self.processed_paths[0])

    @property
    def raw_file_names(self):
        return ['some_file_1', 'some_file_2', ...]

    @property
    def processed_file_names(self):
        return ['data.pt']

    def download(self):
        # Download to `self.raw_dir`.
        download_url(url, self.raw_dir)

    def process(self):
        # Read data into huge `Data` list.
        data_list = []
        graph_input = Data(
            x=torch.from_numpy(x).float(),
            y=torch.from_numpy(y).float(),
            cluster=torch.from_numpy(cluster).short(),
            edge_index=torch.from_numpy(edge_index).long(),
            identifier=torch.from_numpy(identifier).float(),
            traj_len=torch.tensor([traj_lens[ind]]).int(),
            valid_len=torch.tensor([valid_lens[ind]]).int(),
            time_step_len=torch.tensor([num_valid_len_max]).int(),
            candidate_len_max=torch.tensor([num_candidate_max]).int(),
            candidate_mask=[],
            candidate=torch.from_numpy(raw_data['tar_candts'].values[0]).float(),
            candidate_gt=torch.from_numpy(raw_data['gt_candts'].values[0]).bool(),
            offset_gt=torch.from_numpy(raw_data['gt_tar_offset'].values[0]).float(),
            target_gt=torch.from_numpy(raw_data['gt_preds'].values[0][0][-1, :]).float(),
        )
        data_list.append(graph_input)
        data, slices = self.collate(data_list)
        torch.save((data, slices), self.processed_paths[0])

Expected behavior

No error raised.

Environment

  • PyG version: torch_geometric.2.0.4
  • PyTorch version: torch.1.10.1
  • OS: Ubuntu16.04
  • Python version: 3.9
  • CUDA/cuDNN version: cuda11.3
  • How you installed PyTorch and PyG: pip
  • Any other relevant information: torch-scatter.2.0.9

Additional context

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
alvarogunawan commented, Mar 18, 2022

I think I've found what's causing the issue with collate. I had an empty, non-tensor list as one of the data stores of the entries of my custom Dataset. I can see that in @McDifference's example, they also have an empty list: candidate_mask=[]. Here's a small example that generates the same error:

import torch
from torch_geometric.datasets import FakeHeteroDataset
from torch_geometric.loader import DataLoader
    
data = FakeHeteroDataset().data
data["v0"].nontensor = []  # empty, non-tensor list
dataset = [data, data]
loader = DataLoader(dataset, batch_size=2, shuffle=True)
print(data)
for b in loader:
    print(b)

This prints the data object and then raises the following error:

HeteroData(
  v0={
    x=[1097, 67],
    y=[1097],
    nontensor=[0]
  },
  v1={ x=[1184, 55] },
  v2={ x=[958, 75] },
  (v2, e0, v1)={ edge_index=[2, 9537] },
  (v1, e0, v2)={ edge_index=[2, 11773] },
  (v2, e0, v0)={ edge_index=[2, 9539] },
  (v0, e0, v2)={ edge_index=[2, 10915] },
  (v0, e0, v1)={ edge_index=[2, 10913] },
  (v2, e0, v2)={ edge_index=[2, 9530] }
)
Traceback (most recent call last):
  File "dataset_test.py", line 10, in <module>
    for b in loader:
  File "/home/em11824/.local/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/home/em11824/.local/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 561, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/em11824/.local/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
    return self.collate_fn(data)
  File "/home/em11824/.local/lib/python3.8/site-packages/torch_geometric/loader/dataloader.py", line 19, in __call__
    return Batch.from_data_list(batch, self.follow_batch,
  File "/home/em11824/.local/lib/python3.8/site-packages/torch_geometric/data/batch.py", line 68, in from_data_list
    batch, slice_dict, inc_dict = collate(
  File "/home/em11824/.local/lib/python3.8/site-packages/torch_geometric/data/collate.py", line 84, in collate
    value, slices, incs = _collate(attr, values, data_list, stores,
  File "/home/em11824/.local/lib/python3.8/site-packages/torch_geometric/data/collate.py", line 184, in _collate
    and isinstance(elem[0], (Tensor, SparseTensor))):
IndexError: list index out of range

I still need some empty lists in my dataset, so I was able to fix it by changing them from normal lists to empty tensors. I think the error is caused by the following in _collate:

https://github.com/pyg-team/pytorch_geometric/blob/57c88c0c7b86fb38715b72f5914bb208c7aa11e7/torch_geometric/data/collate.py#L183-L184

When one of the data stores is an empty list, it is a Sequence. However, it doesn't have any elements, so elem[0] throws an index out of range error. The issue with _separate might be similar, but I didn't encounter it myself.
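
To make the failure mode concrete, here is a minimal sketch in plain Python that does not import PyG. FakeTensor, unguarded_check and guarded_check are hypothetical names used only for illustration. The unguarded condition is an assumption based on the traceback line and the explanation above, not a copy of the library source, and the length guard is one possible way to avoid the error, not necessarily how PyG handles it.

```python
from collections.abc import Sequence


class FakeTensor:
    """Hypothetical stand-in for torch.Tensor so the sketch runs without PyTorch."""


def unguarded_check(elem):
    # Assumed shape of the failing condition: an empty list passes the
    # Sequence test, so elem[0] is evaluated and raises IndexError.
    return isinstance(elem, Sequence) and isinstance(elem[0], FakeTensor)


def guarded_check(elem):
    # Same condition with a length test placed before the index access,
    # so an empty sequence short-circuits to False instead of raising.
    return (isinstance(elem, Sequence) and len(elem) > 0
            and isinstance(elem[0], FakeTensor))


try:
    unguarded_check([])
    raised = False
except IndexError:
    raised = True

print(raised)                         # True
print(guarded_check([]))              # False
print(guarded_check([FakeTensor()]))  # True
```

The same reasoning would apply to the value[0][0] access reported in _separate, where either level of indexing can fail on an empty list.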

1 reaction
zmyzxb commented, Jan 25, 2022

It seems that your code is based on a lower version of torch-geometric. I just solved the same problem by uninstalling pyg (from conda) and reinstalling torch-geometric=1.7.2 with pip.
