Error when training htc_x101_64x4d_fpn_20e_16 model on a Custom Dataset
Describe the bug
I tried training the htc_x101_64x4d_fpn_20e_16gpu model on a custom dataset. I set the ‘seg_prefix’ location to the folder that contains my segmentation maps, but soon after training starts, it fails with this error:
RuntimeError: 1only batches of spatial targets supported (non-empty 3D tensors) but got targets of size: : [1, 100, 100, 3]
Also, can you please tell me what the difference is between htc without semantic and htc with semantic?
Reproduction
- What command or script did you run?
python tools/train.py ~/Prateek/Prateek/mmdetection2/mmdetection/configs/htc/htc_x101_64x4d_fpn_20e_16gpu.py
- Did you make any modifications on the code or config? Did you understand what you have modified?
I modified num_classes according to the custom dataset. I’m not sure what value of num_classes I should set in ‘semantic_head’.
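One way to reason about the num_classes value for ‘semantic_head’: the traceback shows that the semantic loss is a cross-entropy, so every label in the segmentation maps (apart from an ignored value) has to be smaller than num_classes. The sketch below assumes the maps are already single-channel NumPy arrays of class indices and that 255 is the ignore value; `required_num_classes` and `IGNORE_LABEL` are hypothetical names used for illustration, not part of MMDetection.

```python
import numpy as np

IGNORE_LABEL = 255  # assumed ignore value; use whatever your config sets


def required_num_classes(label_maps, ignore_label=IGNORE_LABEL):
    """Return the smallest num_classes that covers every label in `label_maps`."""
    max_label = -1
    for label_map in label_maps:
        values = np.unique(np.asarray(label_map))
        values = values[values != ignore_label]  # ignored pixels do not count
        if values.size:
            max_label = max(max_label, int(values.max()))
    return max_label + 1


# Two toy label maps: the largest real label is 4, so 5 classes are needed.
maps = [np.array([[0, 1], [2, 255]]), np.array([[4, 4], [0, 255]])]
print(required_num_classes(maps))  # 5
```

If the value printed for your own maps is larger than the num_classes set in ‘semantic_head’, the loss would see out-of-range labels.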
Environment
sys.platform: linux
Python: 3.7.6 (default, Jan 8 2020, 19:59:22) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.1, V10.1.168
GPU 0,1: GeForce RTX 2080 Ti
GCC: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
PyTorch: 1.4.0
PyTorch compiling details: PyTorch built with:
- GCC 7.3
- Intel® Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel® 64 architecture applications
- Intel® MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- NNPACK is enabled
- CUDA Runtime 10.1
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
- CuDNN 7.6.3
- Magma 2.5.1
- Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,
TorchVision: 0.5.0
OpenCV: 4.1.2
MMCV: 0.2.16
MMDetection: 1.0rc1+4b984a7
MMDetection Compiler: GCC 5.4
MMDetection CUDA Compiler: 10.1
Error traceback
2020-01-26 15:34:51,233 - INFO - workflow: [('train', 1)], max: 25 epochs
Traceback (most recent call last):
File "tools/train.py", line 124, in <module>
main()
File "tools/train.py", line 120, in main
timestamp=timestamp)
File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/apis/train.py", line 133, in train_detector
timestamp=timestamp)
File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/apis/train.py", line 319, in _non_dist_train
runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/mmcv/runner/runner.py", line 364, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/mmcv/runner/runner.py", line 268, in train
self.model, data_batch, train_mode=True, **kwargs)
File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/apis/train.py", line 100, in batch_processor
losses = model(**data)
File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/core/fp16/decorators.py", line 49, in new_func
return old_func(*args, **kwargs)
File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/models/detectors/base.py", line 138, in forward
return self.forward_train(img, img_meta, **kwargs)
File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/models/detectors/htc.py", line 230, in forward_train
loss_seg = self.semantic_head.loss(semantic_pred, gt_semantic_seg)
File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/core/fp16/decorators.py", line 127, in new_func
return old_func(*args, **kwargs)
File "/home/user4/Prateek/Prateek/mmdetection2/mmdetection/mmdet/models/mask_heads/fused_semantic_head.py", line 108, in loss
loss_semantic_seg = self.criterion(mask_pred, labels)
File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 916, in forward
ignore_index=self.ignore_index, reduction=self.reduction)
File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/functional.py", line 2021, in cross_entropy
return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
File "/home/user4/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/functional.py", line 1840, in nll_loss
ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: 1only batches of spatial targets supported (non-empty 3D tensors) but got targets of size: : [1, 100, 100, 3]
Thanks for the help!
Top GitHub Comments
htc without semantic means that HTC will not perform the semantic segmentation task, while the default htc also performs semantic segmentation and uses the segmentation features for instance segmentation. The bug means the target has an unexpected shape. You may check what the target looks like for the COCO dataset and make the target of your own data have similar dimensions.

@prateek-77 I used the method mentioned in #1179 to create masked images like stuffthingmap. I am still facing the same error. Have you made any other changes besides this? I have opened a new issue, #5608; please have a look and let me know your inputs.
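The size [1, 100, 100, 3] in the error suggests that each segmentation map reached the loss as a three-channel image, whereas the cross-entropy loss expects a target of shape (N, H, W) holding one class index per pixel. A minimal sketch of collapsing such a map into a single channel is shown below. It assumes the maps are available as NumPy arrays; `to_label_map` and the `palette` mapping are hypothetical names for illustration, and 255 is used for unmapped colors on the assumption that it matches the ignore label in the config.

```python
import numpy as np


def to_label_map(seg, palette=None):
    """Collapse an (H, W, 3) segmentation image into an (H, W) map of class indices.

    `palette` is a hypothetical {(c0, c1, c2): class_index} mapping. When it is
    None, the three channels are assumed to be identical copies of the label
    map (a grayscale image saved with three channels), and the first is returned.
    """
    seg = np.asarray(seg)
    if seg.ndim == 2:
        return seg.astype(np.uint8)  # already single-channel
    if seg.ndim != 3 or seg.shape[2] != 3:
        raise ValueError(f"expected (H, W) or (H, W, 3), got {seg.shape}")
    if palette is None:
        if not (np.array_equal(seg[..., 0], seg[..., 1])
                and np.array_equal(seg[..., 0], seg[..., 2])):
            raise ValueError("channels differ; pass a palette to map colors to classes")
        return seg[..., 0].astype(np.uint8)
    labels = np.full(seg.shape[:2], 255, dtype=np.uint8)  # 255 = unmapped color
    for color, index in palette.items():
        # A pixel matches only if all three channel values equal the color.
        labels[np.all(seg == np.array(color), axis=-1)] = index
    return labels


# Toy 100x100 color-coded map: top half is class 1, bottom half is class 0.
rgb = np.zeros((100, 100, 3), dtype=np.uint8)
rgb[:50] = (255, 0, 0)
label_map = to_label_map(rgb, palette={(0, 0, 0): 0, (255, 0, 0): 1})
print(label_map.shape)  # (100, 100)
```

The converted maps would then need to be saved as single-channel images in the ‘seg_prefix’ folder, so that a batch of them has three dimensions rather than four.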