`Caught IndexError in DataLoader worker process 0` using `pip` installations

Setup

Running on Windows Subsystem for Linux 2 (WSL2).

git clone https://github.com/Janspiry/Palette-Image-to-Image-Diffusion-Models.git
cd Palette-Image-to-Image-Diffusion-Models
conda create -n pip-palette python==3.9.*
conda activate pip-palette
pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu113
pip install -r requirements.txt

Config

Same as #21

Directory Structure

Same as #21

Terminal

(pip-palette) sgbaird@Dell-G7:~/GitHub/Palette-Image-to-Image-Diffusion-Models$  cd /home/sgbaird/GitHub/Palette-Image-to-Image-Diffusion-Models ; /usr/bin/env /home/sgbaird/miniconda3/envs/palette/bin/python /home/sgbaird/.vscode-server/extensions/ms-python.python-2022.8.0/pythonFiles/lib/python/debugpy/launcher 36177 -- /home/sgbaird/GitHub/Palette-Image-to-Image-Diffusion-Models/run.py -p train -c config/inpainting_celebahq_dummy.json --debug 
export CUDA_VISIBLE_DEVICES=0
/home/sgbaird/GitHub/Palette-Image-to-Image-Diffusion-Models/run.py:28: UserWarning: You have chosen to use cudnn for accleration. torch.backends.cudnn.enabled=True
  warnings.warn('You have chosen to use cudnn for accleration. torch.backends.cudnn.enabled=True')
(pip-palette) sgbaird@Dell-G7:~/GitHub/Palette-Image-to-Image-Diffusion-Models$  cd /home/sgbaird/GitHub/Palette-Image-to-Image-Diffusion-Models ; /usr/bin/env /home/sgbaird/miniconda3/envs/pip-palette/bin/python /home/sgbaird/.vscode-server/extensions/ms-python.python-2022.8.0/pythonFiles/lib/python/debugpy/launcher 41379 -- /home/sgbaird/GitHub/Palette-Image-to-Image-Diffusion-Models/run.py -p train -c config/inpainting_celebahq_dummy.json --debug 
export CUDA_VISIBLE_DEVICES=0
/home/sgbaird/GitHub/Palette-Image-to-Image-Diffusion-Models/run.py:28: UserWarning: You have chosen to use cudnn for accleration. torch.backends.cudnn.enabled=True
  warnings.warn('You have chosen to use cudnn for accleration. torch.backends.cudnn.enabled=True')
  0%|                                                     | 0/16 [00:00<?, ?it/s]
Close the Tensorboard SummaryWriter.

Error

Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/sgbaird/miniconda3/envs/pip-palette/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/sgbaird/miniconda3/envs/pip-palette/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/sgbaird/miniconda3/envs/pip-palette/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/sgbaird/miniconda3/envs/pip-palette/lib/python3.9/site-packages/torch/utils/data/dataset.py", line 471, in __getitem__
    return self.dataset[self.indices[idx]]
  File "/home/sgbaird/GitHub/Palette-Image-to-Image-Diffusion-Models/data/dataset.py", line 54, in __getitem__
    path = self.imgs[index]
IndexError: list index out of range
  File "/home/sgbaird/miniconda3/envs/pip-palette/lib/python3.9/site-packages/torch/_utils.py", line 457, in reraise
    raise exception
  File "/home/sgbaird/miniconda3/envs/pip-palette/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1250, in _process_data
    data.reraise()
  File "/home/sgbaird/miniconda3/envs/pip-palette/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1224, in _next_data
    return self._process_data(data)
  File "/home/sgbaird/miniconda3/envs/pip-palette/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 530, in __next__
    data = self._next_data()
  File "/home/sgbaird/miniconda3/envs/pip-palette/lib/python3.9/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/home/sgbaird/GitHub/Palette-Image-to-Image-Diffusion-Models/models/model.py", line 106, in train_step
    for train_data in tqdm.tqdm(self.phase_loader):
  File "/home/sgbaird/GitHub/Palette-Image-to-Image-Diffusion-Models/core/base_model.py", line 45, in train
    train_log = self.train_step()
  File "/home/sgbaird/GitHub/Palette-Image-to-Image-Diffusion-Models/run.py", line 58, in main_worker
    model.train()
  File "/home/sgbaird/GitHub/Palette-Image-to-Image-Diffusion-Models/run.py", line 92, in <module>
    main_worker(0, 1, opt)
  File "/home/sgbaird/miniconda3/envs/pip-palette/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/sgbaird/miniconda3/envs/pip-palette/lib/python3.9/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/sgbaird/miniconda3/envs/pip-palette/lib/python3.9/runpy.py", line 268, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/sgbaird/miniconda3/envs/pip-palette/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/sgbaird/miniconda3/envs/pip-palette/lib/python3.9/runpy.py", line 197, in _run_module_as_main (Current frame)
    return _run_code(code, main_globals, None,
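
The innermost frames of the traceback show `Subset.__getitem__` forwarding `self.indices[idx]` to the repository's dataset, where `self.imgs[index]` fails. One way to narrow this down is to compare the indices the subset can request against the number of images the dataset actually found. The sketch below uses plain-Python stand-ins: `ListDataset`, `IndexSubset`, and `find_out_of_range` are hypothetical names, not part of PyTorch or the repository. It also assumes, without confirmation from the issue, that the image folder holds fewer files than the split expects.

```python
class ListDataset:
    """Hypothetical stand-in for a dataset that keeps its file list in `imgs`."""

    def __init__(self, imgs):
        self.imgs = list(imgs)

    def __len__(self):
        return len(self.imgs)

    def __getitem__(self, index):
        return self.imgs[index]  # raises IndexError when index >= len(self.imgs)


class IndexSubset:
    """Hypothetical stand-in mirroring the `self.dataset[self.indices[idx]]` frame."""

    def __init__(self, dataset, indices):
        self.dataset = dataset
        self.indices = list(indices)

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, idx):
        return self.dataset[self.indices[idx]]


def find_out_of_range(subset):
    """Return the subset indices that fall outside 0..len(dataset)-1."""
    n = len(subset.dataset)
    return [i for i in subset.indices if not 0 <= i < n]


# Assumed scenario: 3 images on disk, but the split was built for 5 samples.
dataset = ListDataset(["a.jpg", "b.jpg", "c.jpg"])
subset = IndexSubset(dataset, range(5))

bad = find_out_of_range(subset)
print("out-of-range indices:", bad)

try:
    subset[4]
except IndexError as exc:
    print("IndexError:", exc)
```

If a check like this reports out-of-range indices on the real data, the next thing to inspect would be how the file list and the split sizes are derived from the config.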

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

1 reaction
Janspiry commented, Jul 22, 2022

@ani0075, thanks for suggesting this. I will fix it asap.

0 reactions
ani0075 commented, Jul 21, 2022

I was able to solve the problem by getting the number of images in the batch explicitly.

# Use the number of items actually present in this batch.
temp_batch_size = len(self.path)
for idx in range(temp_batch_size):
    ret_path.append('GT_{}'.format(self.path[idx]))
    ret_result.append(self.gt_image[idx].detach().float().cpu())

    # Every temp_batch_size-th entry of self.visuals, starting at idx.
    ret_path.append('Process_{}'.format(self.path[idx]))
    ret_result.append(self.visuals[idx::temp_batch_size].detach().float().cpu())

    # Negative index: entry idx within the last temp_batch_size entries.
    ret_path.append('Out_{}'.format(self.path[idx]))
    ret_result.append(self.visuals[idx-temp_batch_size].detach().float().cpu())
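
The indexing in this snippet can be tried in isolation. The sketch below assumes, based only on the `Process_` and `Out_` labels, that `visuals` stacks several snapshots of the whole batch one after another, and it uses plain lists in place of tensors. `split_results`, `configured_batch_size`, and the sample values are hypothetical. It illustrates why taking the batch size from `len(paths)` keeps a short final batch in range, where a larger fixed size would not.

```python
def split_results(paths, gt_images, visuals, batch_size):
    """Group per-sample results with the same indexing as the snippet above."""
    names, results = [], []
    for idx in range(batch_size):
        names.append("GT_{}".format(paths[idx]))
        results.append(gt_images[idx])

        names.append("Process_{}".format(paths[idx]))
        results.append(visuals[idx::batch_size])  # every snapshot of sample idx

        names.append("Out_{}".format(paths[idx]))
        results.append(visuals[idx - batch_size])  # sample idx in the last snapshot
    return names, results


# Assumed data: a final batch of 2 samples, with 3 snapshots stacked in order.
paths = ["a.jpg", "b.jpg"]
gt_images = ["gt_a", "gt_b"]
visuals = ["a0", "b0", "a1", "b1", "a2", "b2"]

names, results = split_results(paths, gt_images, visuals, len(paths))
print(names)
print(results)

configured_batch_size = 4  # hypothetical value larger than the final batch
try:
    split_results(paths, gt_images, visuals, configured_batch_size)
except IndexError as exc:
    print("IndexError with fixed batch size:", exc)
```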