Non Square Input Resolution with YoloX - Validation mAP=0
Hello, I’m training a yolox-l model on a custom COCO-style dataset. This works well with a square input resolution: the default is (640, 640), and I also trained successfully on (800, 800) and (1120, 1120).
Using the exact same config but changing the input resolution to anything non-square, e.g. (960, 800), consistently results in a validation mAP of 0.
I used the browse_dataset.py script to check that the training data looks good (no excessive distortion, and bboxes located and sized properly). As far as I can see, the same resize is used for train, validation, and test: no padding and keep_ratio=False.
Example results for (960,800):
config:
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
conf_dir = '/ssd/Arik/PycharmProjects/mmdetection/configs/yolox/'
img_scale = (960, 800)
max_epochs = 50
num_last_epochs = 10
interval = 1
auto_resume = False
gpu_ids = [0]
optimizer = dict(
    type='SGD',
    lr=0.001,
    momentum=0.9,
    weight_decay=0.0005,
    nesterov=True,
    paramwise_cfg=dict(norm_decay_mult=0.0, bias_decay_mult=0.0))
optimizer_config = dict(grad_clip=None)
lr_config = dict(
    policy='YOLOX',
    warmup='exp',
    by_epoch=False,
    warmup_by_epoch=True,
    warmup_ratio=1,
    warmup_iters=1,
    num_last_epochs=10,
    min_lr_ratio=0.05)
runner = dict(type='EpochBasedRunner', max_epochs=50)
checkpoint_config = dict(interval=1)
log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook'),
           dict(type='TensorboardLoggerHook')])
custom_hooks = [
    dict(type='YOLOXModeSwitchHook', num_last_epochs=10, priority=48),
    dict(type='SyncNormHook', num_last_epochs=10, interval=1, priority=48),
    dict(
        type='ExpMomentumEMAHook',
        resume_from=None,
        momentum=0.0001,
        priority=49)
]
model = dict(
    type='YOLOX',
    input_size=(960, 800),
    random_size_range=(15, 25),
    random_size_interval=10,
    backbone=dict(type='CSPDarknet', deepen_factor=1.0, widen_factor=1.0),
    neck=dict(
        type='YOLOXPAFPN',
        in_channels=[256, 512, 1024],
        out_channels=256,
        num_csp_blocks=3),
    bbox_head=dict(
        type='YOLOXHead', num_classes=2, in_channels=256, feat_channels=256),
    train_cfg=dict(assigner=dict(type='SimOTAAssigner', center_radius=2.5)),
    test_cfg=dict(score_thr=0.001, nms=dict(type='nms', iou_threshold=0.65)))
dataset_type = 'CocoDataset'
data = dict(
    samples_per_gpu=4,
    workers_per_gpu=2,
    persistent_workers=True,
    train=dict(
        type='MultiImageMixDataset',
        dataset=dict(
            type='CocoDataset',
            ann_file=
            '/ssd/Arik/tel_aviv/data/coco/2022_02_07_containers_filtered/coco_labels_train.json',
            img_prefix='/ssd/Arik/tel_aviv/data/labeled/',
            classes=['opentrashcan', 'trashcontainer'],
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(type='LoadAnnotations', with_bbox=True)
            ],
            filter_empty_gt=False),
        pipeline=[
            dict(type='RandomFlip', flip_ratio=0.5),
            dict(type='Resize', img_scale=(960, 800), keep_ratio=False),
            dict(
                type='FilterAnnotations',
                min_gt_bbox_wh=(1, 1),
                keep_empty=False),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
        ]),
    val=dict(
        type='CocoDataset',
        ann_file=
        '/ssd/Arik/tel_aviv/data/coco/2022_02_07_containers_filtered/coco_labels_test.json',
        img_prefix='/ssd/Arik/tel_aviv/data/labeled/',
        classes=['opentrashcan', 'trashcontainer'],
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(960, 800),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=False),
                    dict(type='RandomFlip'),
                    dict(type='DefaultFormatBundle'),
                    dict(type='Collect', keys=['img'])
                ])
        ]),
    test=dict(
        type='CocoDataset',
        ann_file=
        '/ssd/Arik/tel_aviv/data/coco/2022_02_07_containers_filtered//coco_labels_test.json',
        img_prefix='/ssd/Arik/tel_aviv/data/labeled/',
        classes=['opentrashcan', 'trashcontainer'],
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(960, 800),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=False),
                    dict(type='RandomFlip'),
                    dict(type='DefaultFormatBundle'),
                    dict(type='Collect', keys=['img'])
                ])
        ]))
evaluation = dict(
    save_best='auto', interval=1, dynamic_intervals=[(40, 1)], metric='bbox')
work_dir = '/ssd/Arik/tel_aviv/training/mmlab/train/yolox_960x800'
Issue Analytics
- State:
- Created 2 years ago
- Comments:5 (1 by maintainers)
YOLOX and its data augmentation require (height, width) order as input.
https://github.com/open-mmlab/mmdetection/blob/98949809b7179fab9391663ee5a4ab5978425f90/mmdet/models/detectors/yolox.py#L104-L105
https://github.com/open-mmlab/mmdetection/blob/master/mmdet/datasets/pipelines/transforms.py

You are right, that was the issue, thank you very much. I’m now able to train with non-square resolutions!
I think the shape order should be changed in the YOLOX implementation to keep it consistent with the dataset processing pipeline (or it should at least raise a warning about the mismatch).
Thanks again, the issue is resolved.
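For anyone hitting the same problem, here is a minimal sketch of how the two orderings could be kept consistent in a config. It rests on assumptions that are not stated in the thread: that the intended image is 960 pixels wide and 800 pixels high, and that the Resize transform's img_scale is given as (width, height). The names HEIGHT, WIDTH and orders_agree are hypothetical, introduced only for illustration. Only the (height, width) requirement for YOLOX comes from the answer above.

```python
# Sketch only: derive both tuples from one definition so they cannot drift apart.
# Assumption: the target image is 960 px wide and 800 px high.
HEIGHT, WIDTH = 800, 960

# Per the answer above, the YOLOX detector expects (height, width).
input_size = (HEIGHT, WIDTH)

# Assumption: the Resize transform takes img_scale as (width, height).
img_scale = (WIDTH, HEIGHT)

# Config fragments using the two orderings (other keys omitted for brevity).
model = dict(type='YOLOX', input_size=input_size)
resize = dict(type='Resize', img_scale=img_scale, keep_ratio=False)


def orders_agree(input_size_hw, img_scale_wh):
    """Hypothetical check: True if both tuples describe the same image shape."""
    return tuple(input_size_hw) == tuple(reversed(img_scale_wh))


# The corrected pair agrees; the original config's pair does not.
print(orders_agree(model['input_size'], resize['img_scale']))
print(orders_agree((960, 800), (960, 800)))
```

For square resolutions both orderings coincide, which would explain why (640, 640), (800, 800) and (1120, 1120) trained without problems.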