Error when training htc_x101_64x4d_fpn_20e_16 model on a Custom Dataset

Describe the bug

I tried training the htc_x101_64x4d_fpn_20e_16gpu model on a custom dataset. I set 'seg_prefix' to the folder that contains my segmentation maps, but soon after training starts it fails with this error:

RuntimeError: 1only batches of spatial targets supported (non-empty 3D tensors) but got targets of size: : [1, 100, 100, 3]

Also, can you please tell me the difference between HTC without semantic and HTC with semantic?

Reproduction

What command or script did you run?

python tools/train.py ~/Prateek/Prateek/mmdetection2/mmdetection/configs/htc/htc_x101_64x4d_fpn_20e_16gpu.py

Did you make any modifications on the code or config? Did you understand what you have modified?

I modified num_classes according to the custom dataset. I'm not sure what value of num_classes I should set in 'semantic_head'.

Environment

sys.platform: linux
Python: 3.7.6 (default, Jan 8 2020, 19:59:22) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.1, V10.1.168
GPU 0,1: GeForce RTX 2080 Ti
GCC: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
PyTorch: 1.4.0
PyTorch compiling details: PyTorch built with:

GCC 7.3
Intel® Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel® 64 architecture applications
Intel® MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
OpenMP 201511 (a.k.a. OpenMP 4.5)
NNPACK is enabled
CUDA Runtime 10.1
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
CuDNN 7.6.3
Magma 2.5.1
Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

TorchVision: 0.5.0
OpenCV: 4.1.2
MMCV: 0.2.16
MMDetection: 1.0rc1+4b984a7
MMDetection Compiler: GCC 5.4
MMDetection CUDA Compiler: 10.1

Error traceback

2020-01-26 15:34:51,233 - INFO - workflow: [('train', 1)], max: 25 epochs

Traceback (most recent call last):
  File "tools/train.py", line 124, in <module>
    main()
  File "tools/train.py", line 120, in main
    timestamp=timestamp)
  File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/apis/train.py", line 133, in train_detector
    timestamp=timestamp)
  File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/apis/train.py", line 319, in _non_dist_train
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/mmcv/runner/runner.py", line 364, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/mmcv/runner/runner.py", line 268, in train
    self.model, data_batch, train_mode=True, **kwargs)
  File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/apis/train.py", line 100, in batch_processor
    losses = model(**data)
  File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/core/fp16/decorators.py", line 49, in new_func
    return old_func(*args, **kwargs)
  File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/models/detectors/base.py", line 138, in forward
    return self.forward_train(img, img_meta, **kwargs)
  File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/models/detectors/htc.py", line 230, in forward_train
    loss_seg = self.semantic_head.loss(semantic_pred, gt_semantic_seg)
  File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/core/fp16/decorators.py", line 127, in new_func
    return old_func(*args, **kwargs)
  File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/models/mask_heads/fused_semantic_head.py", line 108, in loss
    loss_semantic_seg = self.criterion(mask_pred, labels)
  File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 916, in forward
    ignore_index=self.ignore_index, reduction=self.reduction)
  File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/functional.py", line 2021, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/functional.py", line 1840, in nll_loss
    ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: 1only batches of spatial targets supported (non-empty 3D tensors) but got targets of size: : [1, 100, 100, 3]

Thanks for the help!

Issue Analytics

State:
Created 4 years ago
Comments:7

Top GitHub Comments

ZwwWayne commented, Jan 26, 2020

"HTC without semantic" means the model does not perform the semantic segmentation task, whereas the default HTC also performs semantic segmentation and uses the segmentation features for instance segmentation. The error means the target has an unexpected shape. Check what the target looks like for the COCO dataset and make sure the targets for your own data have the same dimensions.
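The target size in the error, [1, 100, 100, 3], suggests that the segmentation map was loaded as a three-channel image, while the loss expects one integer class id per pixel, i.e. a target of shape (N, H, W). Below is a minimal sketch of one way to produce such a map, assuming the custom maps are colour-coded RGB images. The palette, the ignore value of 255, and the helper name `rgb_to_label_map` are illustrative assumptions, not part of MMDetection.

```python
import numpy as np

# Hypothetical palette: RGB colour -> class id. Replace with your own mapping.
PALETTE = {
    (0, 0, 0): 0,
    (255, 0, 0): 1,
    (0, 255, 0): 2,
}
IGNORE_LABEL = 255  # assumed id for pixels whose colour is not in the palette


def rgb_to_label_map(rgb, palette=PALETTE, ignore_label=IGNORE_LABEL):
    """Convert an (H, W, 3) colour-coded map into an (H, W) uint8 label map."""
    rgb = np.asarray(rgb)
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError(f"expected an (H, W, 3) array, got shape {rgb.shape}")
    # Start with every pixel marked as "ignore", then fill in known colours.
    labels = np.full(rgb.shape[:2], ignore_label, dtype=np.uint8)
    for colour, class_id in palette.items():
        match = np.all(rgb == np.array(colour, dtype=rgb.dtype), axis=-1)
        labels[match] = class_id
    return labels


# Synthetic 100x100 map: top half red (class 1), bottom half black (class 0).
demo = np.zeros((100, 100, 3), dtype=np.uint8)
demo[:50, :, 0] = 255
label_map = rgb_to_label_map(demo)
print(label_map.shape)  # (100, 100)
```

Saving the converted array as a single-channel image (rather than RGB) should give the loader a two-dimensional map per image, which is the layout the comment above recommends matching.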

IAMShashankk commented, Jul 13, 2021

@prateek-77 I used the method mentioned in #1179 to create mask images like stuffthingmap, but I am still facing the same error. Did you make any other changes besides this? I have opened a new issue, #5608; please have a look and let me know your thoughts.
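If the error persists after regenerating the maps, it may help to inspect the arrays that are actually loaded. The sketch below uses a hypothetical helper, `summarize_label_map`, which reports whether a map is single-channel and which class ids it contains. It assumes 255 is used as the ignore value. Because cross-entropy targets must lie in the range 0 to C-1 (apart from the ignore index), the largest id plus one is a lower bound for the number of classes of the semantic head; whether that is the right setting for 'semantic_head' in a given config is not confirmed by this thread.

```python
import numpy as np


def summarize_label_map(seg, ignore_label=255):
    """Report the layout and class ids of a loaded segmentation map."""
    seg = np.asarray(seg)
    ids = np.unique(seg)            # sorted unique values over the whole array
    ids = ids[ids != ignore_label]  # the ignore value is not a real class
    return {
        "is_single_channel": seg.ndim == 2,
        "shape": seg.shape,
        "class_ids": ids.tolist(),
        # Targets must be in [0, C-1], so C is at least max id + 1.
        "min_num_classes": int(ids.max()) + 1 if ids.size else 0,
    }


# A well-formed (H, W) map with classes 0 and 2 plus some ignored pixels.
good = np.zeros((100, 100), dtype=np.uint8)
good[:10, :] = 2
good[90:, :] = 255
good_summary = summarize_label_map(good)

# A three-channel map like the one in the error message.
bad_summary = summarize_label_map(np.zeros((100, 100, 3), dtype=np.uint8))

print(good_summary["is_single_channel"], good_summary["min_num_classes"])
print(bad_summary["is_single_channel"], bad_summary["shape"])
```

A map that reports `is_single_channel` as false would reproduce the four-dimensional target seen in the traceback once it is batched.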