Trying to train with HTC, 2 bugs found
Describe the bug
First I get bug #977. It is not clear why that line changes the file name. If I remove it, I get this exception:
Traceback (most recent call last):
File "tools/train.py", line 108, in <module>
main()
File "tools/train.py", line 104, in main
logger=logger)
File "/home/eduardo/dev/mmdetection/mmdet/apis/train.py", line 60, in train_detector
_non_dist_train(model, dataset, cfg, validate=validate)
File "/home/eduardo/dev/mmdetection/mmdet/apis/train.py", line 221, in _non_dist_train
runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
File "/home/eduardo/miniconda2/envs/pytorch/lib/python3.7/site-packages/mmcv-0.2.12-py3.7-linux-x86_64.egg/mmcv/runner/runner.py", line 358, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/eduardo/miniconda2/envs/pytorch/lib/python3.7/site-packages/mmcv-0.2.12-py3.7-linux-x86_64.egg/mmcv/runner/runner.py", line 264, in train
self.model, data_batch, train_mode=True, **kwargs)
File "/home/eduardo/dev/mmdetection/mmdet/apis/train.py", line 38, in batch_processor
losses = model(**data)
File "/home/eduardo/miniconda2/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/home/eduardo/miniconda2/envs/pytorch/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/eduardo/miniconda2/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/home/eduardo/dev/mmdetection/mmdet/core/fp16/decorators.py", line 49, in new_func
return old_func(*args, **kwargs)
File "/home/eduardo/dev/mmdetection/mmdet/models/detectors/base.py", line 86, in forward
return self.forward_train(img, img_meta, **kwargs)
File "/home/eduardo/dev/mmdetection/mmdet/models/detectors/htc.py", line 189, in forward_train
loss_seg = self.semantic_head.loss(semantic_pred, gt_semantic_seg)
File "/home/eduardo/dev/mmdetection/mmdet/core/fp16/decorators.py", line 127, in new_func
return old_func(*args, **kwargs)
File "/home/eduardo/dev/mmdetection/mmdet/models/mask_heads/fused_semantic_head.py", line 104, in loss
loss_semantic_seg = self.criterion(mask_pred, labels)
File "/home/eduardo/miniconda2/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/home/eduardo/miniconda2/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 916, in forward
ignore_index=self.ignore_index, reduction=self.reduction)
File "/home/eduardo/miniconda2/envs/pytorch/lib/python3.7/site-packages/torch/nn/functional.py", line 1995, in cross_entropy
return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
File "/home/eduardo/miniconda2/envs/pytorch/lib/python3.7/site-packages/torch/nn/functional.py", line 1826, in nll_loss
ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: 1only batches of spatial targets supported (non-empty 3D tensors) but got targets of size: : [1, 100, 128, 3]
Reproduction
- What command or script did you run?
python tools/train.py cfg/htc_x101_32x4d_fpn_20e.py
- Did you make any modifications on the code or config? Did you understand what you have modified?
Only pointed the configuration to my dataset paths.
- What dataset did you use?
I am using my own COCO-formatted dataset. It worked fine for RetinaNet but does not work with HTC (config htc_x101_32x4d_fpn_20e).
With RetinaNet I didn't provide segmentation. Here it seems mandatory (training crashes if it is not provided), so I fill it with the 4 coordinates of the bbox rectangle.
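For reference, a minimal sketch of what filling the segmentation field from the box could look like, assuming the standard COCO layout where bbox is [x, y, width, height] and a polygon is a flat list of x, y pairs. The helper name bbox_to_polygon is hypothetical and is not part of mmdetection. This only covers the per-instance polygons; the seg map label files discussed in the comments are separate images.

```python
def bbox_to_polygon(bbox):
    """Turn a COCO [x, y, width, height] box into one rectangle polygon.

    Hypothetical helper for illustration; not an mmdetection function.
    """
    x, y, w, h = bbox
    # Four corners, clockwise from the top-left, flattened as x1, y1, x2, y2, ...
    return [[x, y, x + w, y, x + w, y + h, x, y + h]]


# Made-up annotation entry with only the fields used here.
annotation = {"bbox": [10, 20, 30, 40]}
annotation["segmentation"] = bbox_to_polygon(annotation["bbox"])
print(annotation["segmentation"])
```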
Environment
- OS: Ubuntu 18.04.2
- GCC 7.4.0
- PyTorch version 1.2.0 (conda)
- GPU model: 2080 ti
- CUDA 10 and CUDNN 7.6.0
Issue Analytics
- State:
- Created 4 years ago
- Comments:14 (3 by maintainers)
Top GitHub Comments
The error comes from the seg map label files. The .png label files should be read with PIL.Image; just like in the VOC dataset, these label images are in palette mode. If we just read them with opencv or mmcv, we get labels with a shape like H,W,3 when it should be H,W, so the torch.nn.CrossEntropy loss function raises an error.

As this is the third time I see this question, perhaps it should be in the readme? A better error message will also be good.
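The shape difference described above can be reproduced with a small sketch, assuming Pillow and NumPy are installed. The tiny label image, its class ids, and its palette are made up, and converting to RGB only stands in for a reader that decodes the PNG to three channels. The check_label_map guard is a hypothetical helper showing one way to give a clearer error message; it is not part of mmdetection.

```python
import io

import numpy as np
from PIL import Image

# Build a tiny palette-mode ("P") PNG in memory. Class ids and colours are made up.
class_ids = np.array([[0, 1], [2, 1]], dtype=np.uint8)
label_img = Image.frombytes("P", (2, 2), class_ids.tobytes())
palette = [0, 0, 0, 128, 0, 0, 0, 128, 0] + [0] * (768 - 9)
label_img.putpalette(palette)
buffer = io.BytesIO()
label_img.save(buffer, format="PNG")
buffer.seek(0)

# Reading with PIL keeps the palette indices: one class id per pixel.
indexed = np.array(Image.open(buffer))

# Expanding the palette to colours stands in for a reader that returns 3 channels.
buffer.seek(0)
as_colour = np.array(Image.open(buffer).convert("RGB"))


def check_label_map(labels):
    """Hypothetical guard: fail early with a clearer message than the loss gives."""
    if labels.ndim != 2:
        raise ValueError(
            f"expected an (H, W) label map, got shape {labels.shape}; "
            "was a palette PNG decoded as a colour image?"
        )
    return labels


print(indexed.shape)    # (2, 2)
print(as_colour.shape)  # (2, 2, 3)
check_label_map(indexed)
```

Calling check_label_map on the three-channel array raises ValueError before the data reaches the loss function.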
On Fri, Aug 9, 2019, 18:09 Kai Chen notifications@github.com wrote: