PyTorch 1.3 breaks previously working training code (IndexError)
I just upgraded to PyTorch 1.3 (built from source), and training code that previously worked no longer runs.
/usr/local/lib/python3.5/dist-packages/torch/optim/lr_scheduler.py:82: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule.See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
"https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
/pytorch/aten/src/ATen/native/IndexingUtils.h:20: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead.
" please use a dtype torch.bool instead.");
/pytorch/aten/src/ATen/native/IndexingUtils.h:20: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead.
" please use a dtype torch.bool instead.");
/pytorch/aten/src/ATen/native/IndexingUtils.h:20: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead.
" please use a dtype torch.bool instead.");
/pytorch/aten/src/ATen/native/IndexingUtils.h:20: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead.
" please use a dtype torch.bool instead.");
/pytorch/aten/src/ATen/native/IndexingUtils.h:20: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead.
" please use a dtype torch.bool instead.");
maskrcnn-benchmark_local/vendor/maskrcnn-benchmark/maskrcnn_benchmark/engine/trainer.py", line 57, in do_train
for iteration, (images, targets, _) in enumerate(data_loader, start_iter):
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 819, in __next__
return self._process_data(data)
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 846, in _process_data
data.reraise()
File "/usr/local/lib/python3.5/dist-packages/torch/_utils.py", line 369, in reraise
raise self.exc_type(msg)
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataset.py", line 207, in __getitem__
return self.datasets[dataset_idx][sample_idx]
File "//maskrcnn-benchmark_local/vendor/maskrcnn-benchmark/maskrcnn_benchmark/data/datasets/coco.py", line 94, in __getitem__
target = target.clip_to_image(remove_empty=True)
File "s/fagangjin/work/maskrcnn-benchmark_local/vendor/maskrcnn-benchmark/maskrcnn_benchmark/structures/bounding_box.py", line 223, in clip_to_image
return self[keep]
File k/maskrcnn-benchmark_local/vendor/maskrcnn-benchmark/maskrcnn_benchmark/structures/bounding_box.py", line 208, in __getitem__
bbox.add_field(k, v[item])
File "/maskrcnn-benchmark_local/vendor/maskrcnn-benchmark/maskrcnn_benchmark/structures/segmentation_mask.py", line 553, in __getitem__
selected_instances = self.instances.__getitem__(item)
File "/maskrcnn-benchmark_local/vendor/maskrcnn-benchmark/maskrcnn_benchmark/structures/segmentation_mask.py", line 462, in __getitem__
selected_polygons.append(self.polygons[i])
IndexError: list index out of range
A similar error has been reported before, but I am sure this one is unrelated, since I just cloned a fresh copy of maskrcnn-benchmark.
Does it only happen on PyTorch 1.3?
To be more specific, it happens in these lines of code:
Issue Analytics
- State:
- Created 4 years ago
- Comments:7 (2 by maintainers)
Top GitHub Comments
You are right, thanks! The warning disappears when “uint8” is replaced by “bool” in this line.
I think it is because the dtype returned by comparison operations changed in PyTorch 1.2. https://github.com/pytorch/pytorch/releases
If `__getitem__` is called from `clip_to_image`, the dtype of `keep` is now `torch.bool` instead of `torch.uint8`, so you could change the dtype check in `__getitem__` from `item.dtype == torch.uint8:` to `item.dtype == torch.bool:`.
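As a rough illustration of the dtype check discussed above, here is a minimal sketch, not the actual maskrcnn-benchmark code. The helper `select_by_mask_or_indices` and the sample data are hypothetical. It treats both `torch.bool` and `torch.uint8` tensors as masks, so the same selection logic should behave the same before and after the dtype change in comparison results.

```python
import torch


def select_by_mask_or_indices(items, item):
    """Hypothetical helper: pick list entries by a tensor mask or by integer indices."""
    if isinstance(item, torch.Tensor) and item.dtype in (torch.bool, torch.uint8):
        # Mask case: convert the positions that are set into plain integer indices.
        item = item.nonzero().squeeze(1).tolist()
    elif isinstance(item, torch.Tensor):
        # Integer index tensor: use the values directly.
        item = item.tolist()
    return [items[i] for i in item]


# Made-up data standing in for per-box values and their polygons.
box_widths = torch.tensor([5.0, -1.0, 7.0])
keep = box_widths > 0  # comparison result; torch.bool on PyTorch 1.2 and later
polygons = ["poly_a", "poly_b", "poly_c"]

selected = select_by_mask_or_indices(polygons, keep)
print(keep.dtype, selected)
```

Accepting both dtypes is one possible design choice; checking only `torch.bool`, as suggested in the comment, is enough when the code only needs to run on PyTorch 1.2 or later.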
We could also resolve the warnings by modifying the dtype in `maskrcnn_benchmark/modeling/balanced_positive_negative_sampler.py`.
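For the deprecation warning itself, the general pattern is to build masks as `torch.bool` rather than `torch.uint8` before using them for indexing. The snippet below is only a sketch of that pattern with made-up tensors (`labels`, `pos_idx`); it is not taken from the sampler file.

```python
import torch

# Made-up example data: per-element labels and the positions chosen as positives.
labels = torch.tensor([1, 0, -1, 1, 0])
pos_idx = torch.tensor([0, 3])

# Create the mask as torch.bool instead of torch.uint8.
pos_mask = torch.zeros_like(labels, dtype=torch.bool)
pos_mask[pos_idx] = True

# Indexing with a bool mask selects the marked elements.
selected = labels[pos_mask]
print(pos_mask.tolist(), selected.tolist())
```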