IndexError: single positional indexer is out-of-bounds

Describe the bug

I get the error in the title when trying to train a custom object detector. I suspect the folders containing my images and labels are not being read. I’ve tried 'labels', '/labels' and '/labels/' as the label path.

Code and Data

from detecto import core, utils, visualize

# Images and XML files in separate folders
dataset = core.Dataset('labels/', 'images/')

image, target = dataset[0]
print(image, target)

model = core.Model(['bat', 'batter', 'pitch', 'field', 'player', 'scoreboard'])

model.fit(dataset)


# Specify the path to your image
image = utils.read_image('images/image0.jpg')
predictions = model.predict(image)

# predictions format: (labels, boxes, scores)
labels, boxes, scores = predictions

print(labels) 
print(boxes)
print(scores)

Stacktrace

Traceback (most recent call last):
  File "c:/Users/julis/Documents/ap-cricket/functions/train.py", line 9, in <module>
    image, target = dataset[0]
  File "C:\Users\julis\AppData\Local\Programs\Python\Python38\lib\site-packages\detecto\core.py", line 148, in __getitem__
    img_name = os.path.join(self._root_dir, self._csv.iloc[idx, 0])
  File "C:\Users\julis\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexing.py", line 873, in __getitem__
    return self._getitem_tuple(key)
  File "C:\Users\julis\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexing.py", line 1443, in _getitem_tuple
    self._has_valid_tuple(tup)
  File "C:\Users\julis\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexing.py", line 702, in _has_valid_tuple
    self._validate_key(k, i)
  File "C:\Users\julis\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexing.py", line 1352, in _validate_key
    self._validate_integer(key, axis)
  File "C:\Users\julis\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexing.py", line 1437, in _validate_integer
    raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds

Environment:

  • OS: Windows 10
  • Python version: 3.8
  • Detecto version:
  • torch version: 1.5.0
  • torchvision version : 0.6.0

Additional context

Image name is: ‘image0.jpg’. Label name is: ‘image0.xml’.
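
One way to test the suspicion that the folders are not being read is to count the files Python can actually see. Relative paths such as 'labels/' are resolved against the current working directory, which is not necessarily the directory containing train.py. The sketch below uses only the standard library; `count_files` and `report` are hypothetical helper names, and the folder names are the ones from the issue.

```python
from pathlib import Path


def count_files(folder, pattern):
    """Return how many files in `folder` match `pattern` (0 if the folder is missing)."""
    path = Path(folder)
    if not path.is_dir():
        return 0
    return sum(1 for p in path.glob(pattern) if p.is_file())


def report(labels_dir="labels", images_dir="images"):
    """Summarise where relative paths resolve and how many files are found there."""
    cwd = Path.cwd()
    return {
        "cwd": str(cwd),
        "labels_dir": str((cwd / labels_dir).resolve()),
        "xml_files": count_files(labels_dir, "*.xml"),
        "jpg_files": count_files(images_dir, "*.jpg"),
    }


print(report())
```

If `xml_files` is 0, either run the script from the directory that contains the two folders or pass absolute paths.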

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

bgriffen commented, Feb 25, 2022 (1 reaction)

I’m also caught up at the moment but will take a look when I can too!

bgriffen commented, Feb 24, 2022 (1 reaction)

This has come up again in another revisit of detecto. Out of curiosity @alankbi, is there a way to relax the strict requirement that image_id be both sequential and start at 0 in both the training and test data? I would have thought that after train_test_split you could just give both dataframes to DataLoader, but it seems I then have to reassign all of the image_ids to satisfy that requirement.

Presumably the image_ids could be abstracted away and created on the fly from the filename column of each train/test dataframe going to the DataLoader, for example:

df['image_id'] = df.groupby(['filename']).ngroup()
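
As a hedged illustration of that one-liner (not Detecto code), the sketch below numbers a few sample filenames with `ngroup()`. By default pandas sorts the group keys, so ids follow sorted filename order; `sort=False` numbers groups in order of first appearance instead. The frame contents and the `held_out` subset are made up for the example.

```python
import pandas as pd

# Sample rows; only the filename column matters for the numbering.
frame = pd.DataFrame({
    "filename": [
        "Position001.jpg",
        "Position046.jpg",
        "Position046.jpg",
        "Position046.jpg",
        "Position028.jpg",
    ],
    "xmin": [453, 145, 444, 200, 148],
})

# Default grouping sorts the keys, so ids follow sorted filename order.
sorted_ids = frame.groupby("filename").ngroup().tolist()

# sort=False numbers the groups in order of first appearance.
appearance_ids = frame.groupby("filename", sort=False).ngroup().tolist()

# Recomputing on a subset (as after a train/test split) restarts the ids at 0.
held_out = frame.iloc[[1, 4]].copy()
held_out["image_id"] = held_out.groupby("filename", sort=False).ngroup()

print(sorted_ids, appearance_ids, held_out["image_id"].tolist())
```

Either variant yields ids that start at 0 and have no gaps within a single frame, which is the property the requirement above asks for.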

I’m likely missing something behind the scenes that doesn’t allow this, though.

Here is my most recent issue relating to this (I think):

Begin iterating over training dataset
  0%|                                                                                                        | 0/20 [00:00<?, ?it/s]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-0d2a0526aafd> in <module>
      7       d.df[column] = d.df[column].astype('int32')
      8 
----> 9 d.train()

myprogram.py in train(self, epochs)
    146 
    147     def train(self,epochs=10):
--> 148         self.losses = self.model.fit(self.train_loader,self.test_loader,epochs=epochs,verbose=True)
    149 
    150     def save(self):

~/anaconda3/lib/python3.8/site-packages/detecto-1.2.1-py3.8.egg/detecto/core.py in fit(self, dataset, val_dataset, epochs, learning_rate, momentum, weight_decay, gamma, lr_step_size, verbose)
    516 
    517             iterable = tqdm(dataset, position=0, leave=True) if verbose else dataset
--> 518             for images, targets in iterable:
    519                 self._convert_to_int_labels(targets)
    520                 images, targets = self._to_device(images, targets)

~/anaconda3/lib/python3.8/site-packages/tqdm/std.py in __iter__(self)
   1176 
   1177         try:
-> 1178             for obj in iterable:
   1179                 yield obj
   1180                 # Update and possibly print the progressbar.

~/anaconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py in __next__(self)
    519             if self._sampler_iter is None:
    520                 self._reset()
--> 521             data = self._next_data()
    522             self._num_yielded += 1
    523             if self._dataset_kind == _DatasetKind.Iterable and \

~/anaconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py in _next_data(self)
    559     def _next_data(self):
    560         index = self._next_index()  # may raise StopIteration
--> 561         data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
    562         if self._pin_memory:
    563             data = _utils.pin_memory.pin_memory(data)

~/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py in fetch(self, possibly_batched_index)
     42     def fetch(self, possibly_batched_index):
     43         if self.auto_collation:
---> 44             data = [self.dataset[idx] for idx in possibly_batched_index]
     45         else:
     46             data = self.dataset[possibly_batched_index]

~/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py in <listcomp>(.0)
     42     def fetch(self, possibly_batched_index):
     43         if self.auto_collation:
---> 44             data = [self.dataset[idx] for idx in possibly_batched_index]
     45         else:
     46             data = self.dataset[possibly_batched_index]

~/anaconda3/lib/python3.8/site-packages/detecto-1.2.1-py3.8.egg/detecto/core.py in __getitem__(self, idx)
    151         object_entries = self._csv.loc[self._csv['image_id'] == idx]
    152 
--> 153         img_name = os.path.join(self._root_dir, object_entries.iloc[0, 0])
    154         image = read_image(img_name)
    155 

~/anaconda3/lib/python3.8/posixpath.py in join(a, *p)
     88                 path += sep + b
     89     except (TypeError, AttributeError, BytesWarning):
---> 90         genericpath._check_arg_types('join', a, *p)
     91         raise
     92     return path

~/anaconda3/lib/python3.8/genericpath.py in _check_arg_types(funcname, *args)
    150             hasbytes = True
    151         else:
--> 152             raise TypeError(f'{funcname}() argument must be str, bytes, or '
    153                             f'os.PathLike object, not {s.__class__.__name__!r}') from None
    154     if hasstr and hasbytes:

TypeError: join() argument must be str, bytes, or os.PathLike object, not 'int64'

test.csv

width  height  cluster  xmin  ymin  xmax  ymax  image_id  filename
53     133     1        453   274   505   407   0         Position001.jpg
410    189     0        145   238   555   427   1         Position046.jpg
62     127     0        444   273   506   400   1         Position046.jpg

and train.csv

width  height  cluster  xmin  ymin  xmax  ymax  image_id  filename
53     133     1        453   274   505   407   0         Position001.jpg
410    189     0        145   238   555   427   1         Position046.jpg
62     127     0        444   273   506   400   1         Position046.jpg
56     123     0        200   265   256   388   1         Position046.jpg
413    192     0        148   226   562   418   2         Position028.jpg
413 192 0 148 226 562 418 2 Position028.jpg

As a check…

In [2]: d.train_dataset
Out[2]: <detecto.core.Dataset at 0x7f2af6a5b5e0>

In [3]: d.train_dataset[0]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-c58fa18afada> in <module>
----> 1 d.train_dataset[0]

~/anaconda3/lib/python3.8/site-packages/detecto-1.2.1-py3.8.egg/detecto/core.py in __getitem__(self, idx)
    151         object_entries = self._csv.loc[self._csv['image_id'] == idx]
    152 
--> 153         img_name = os.path.join(self._root_dir, object_entries.iloc[0, 0])
    154         image = read_image(img_name)
    155 

~/anaconda3/lib/python3.8/posixpath.py in join(a, *p)
     88                 path += sep + b
     89     except (TypeError, AttributeError, BytesWarning):
---> 90         genericpath._check_arg_types('join', a, *p)
     91         raise
     92     return path

~/anaconda3/lib/python3.8/genericpath.py in _check_arg_types(funcname, *args)
    150             hasbytes = True
    151         else:
--> 152             raise TypeError(f'{funcname}() argument must be str, bytes, or '
    153                             f'os.PathLike object, not {s.__class__.__name__!r}') from None
    154     if hasstr and hasbytes:

TypeError: join() argument must be str, bytes, or os.PathLike object, not 'int64'

These are assigned here in my program:

        self.train_dataset = Dataset(self.fout_train,transform=self.default_trans)
        self.test_dataset = Dataset(self.fout_test) 
        
        self.train_loader = DataLoader(self.train_dataset, batch_size=batch_size, shuffle=True)
        self.test_loader = DataLoader(self.test_dataset, shuffle=True)

Update: It comes down to iloc indexing. The comments/docs say the CSV file contains:

CSV file contains: filename, width, height, class, xmin, ymin, xmax, ymax

The actual order of those columns matters because of this line in core.py:

--> 153 img_name = os.path.join(self._root_dir, object_entries.iloc[0, 0])
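
Given that positional lookup, one workaround is to reorder the columns before writing the CSV. The sketch below is an assumption-laden example: `reorder_columns` is a hypothetical helper, the first eight names follow the order quoted above, placing `image_id` last is a guess, and renaming `cluster` to `class` assumes that column holds the class label.

```python
import pandas as pd

# Order quoted in the thread; appending "image_id" last is an assumption.
EXPECTED_ORDER = [
    "filename", "width", "height", "class",
    "xmin", "ymin", "xmax", "ymax", "image_id",
]


def reorder_columns(df, rename=None):
    """Hypothetical helper: optionally rename columns, then apply EXPECTED_ORDER."""
    if rename:
        df = df.rename(columns=rename)
    missing = [name for name in EXPECTED_ORDER if name not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return df[EXPECTED_ORDER]


# First row of the CSV excerpt from this thread, in its original column order.
labels = pd.DataFrame({
    "width": [53], "height": [133], "cluster": [1],
    "xmin": [453], "ymin": [274], "xmax": [505], "ymax": [407],
    "image_id": [0], "filename": ["Position001.jpg"],
})

fixed = reorder_columns(labels, rename={"cluster": "class"})
print(fixed.iloc[0, 0])
```

With the filename in the first column, `iloc[0, 0]` returns a string again, so `os.path.join` no longer receives an integer.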

General comment, though: shouldn’t this just be object_entries.loc[0, 'filename'] so that it’s agnostic to the order of the columns? Also, I’m curious about people’s thoughts on running ngroup() in the backend to generate the image_ids.
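
A small sketch of the name-based lookup, using made-up rows rather than Detecto internals. One caveat: `.loc[0, 'filename']` looks up the row labelled 0, and a frame filtered by image_id usually keeps its original row labels, so this example selects the column by name and the row by position instead.

```python
import os

import pandas as pd

labels = pd.DataFrame({
    "width": [53, 410, 62],
    "filename": ["Position001.jpg", "Position046.jpg", "Position046.jpg"],
    "image_id": [0, 1, 1],
})

# Filtering keeps the original row labels (1 and 2 here).
entries = labels.loc[labels["image_id"] == 1]

# Positional lookup depends on column order and returns an integer here.
try:
    os.path.join("images", entries.iloc[0, 0])
    positional_error = None
except TypeError as err:
    positional_error = err

# Label lookup fails because the filtered frame has no row labelled 0.
try:
    entries.loc[0, "filename"]
    label_error = None
except KeyError as err:
    label_error = err

# Column by name, row by position: independent of column order and row labels.
img_name = os.path.join("images", entries["filename"].iloc[0])
print(img_name)
```

The final lookup works regardless of where the filename column sits in the CSV.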
