Creating custom data for training
Hi, I defined a custom dataset with 6 classes and trained it with DeepLabV3+; my config is shown further below.
The custom data directory structure is as follows:
```
├─ann_dir (8)
│  ├─train
│  └─val
└─img_dir (24)
   ├─train
   └─val
```
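Since the config uses the generic `CustomDataset` type but the evaluation log below prints class names, the classes have to be declared somewhere; in mmseg v0.x this is typically done by subclassing `CustomDataset`. A minimal sketch of such a registration — the class name, file suffixes, and palette colours are assumptions for illustration, though the class names match the evaluation log below:

```python
# Sketch of registering a 6-class dataset in mmseg v0.x.
# SIRLabDataset, the suffixes, and the palette are illustrative assumptions.
from mmseg.datasets.builder import DATASETS
from mmseg.datasets.custom import CustomDataset


@DATASETS.register_module()
class SIRLabDataset(CustomDataset):
    CLASSES = ('bedrock', 'stone', 'gravel', 'sand', 'soil', 'others')
    PALETTE = [[128, 64, 128], [244, 35, 232], [70, 70, 70],
               [102, 102, 156], [190, 153, 153], [153, 153, 153]]

    def __init__(self, **kwargs):
        super().__init__(img_suffix='.png', seg_map_suffix='.png', **kwargs)
```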
deeplabv3plus_r50-d8_512x1024_80k_cityscapes_SIR.py is created as follows:
```python
norm_cfg = dict(type='SyncBN', requires_grad=True)
model = dict(
    type='EncoderDecoder',
    pretrained='open-mmlab://resnet50_v1c',
    backbone=dict(
        type='ResNetV1c',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        dilations=(1, 1, 2, 4),
        strides=(1, 2, 1, 1),
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        norm_eval=False,
        style='pytorch',
        contract_dilation=True),
    decode_head=dict(
        type='DepthwiseSeparableASPPHead',
        in_channels=2048,
        in_index=3,
        channels=512,
        dilations=(1, 12, 24, 36),
        c1_in_channels=256,
        c1_channels=48,
        dropout_ratio=0.1,
        num_classes=6,
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=dict(
        type='FCNHead',
        in_channels=1024,
        in_index=2,
        channels=256,
        num_convs=1,
        concat_input=False,
        dropout_ratio=0.1,
        num_classes=6,
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4)))
train_cfg = dict()
test_cfg = dict(mode='whole')
dataset_type = 'CustomDataset'
data_root = 'data/SIRLab_mars/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (512, 512)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=(512, 512), cat_max_ratio=0.75),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size=(512, 512), pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(2048, 512),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=4,
    workers_per_gpu=4,
    train=dict(
        type='CustomDataset',
        data_root='data/SIRLab_mars/',
        img_dir='img_dir/train',
        ann_dir='ann_dir/train',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations'),
            dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),
            dict(type='RandomCrop', crop_size=(512, 512), cat_max_ratio=0.75),
            dict(type='RandomFlip', flip_ratio=0.5),
            dict(type='PhotoMetricDistortion'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size=(512, 512), pad_val=0, seg_pad_val=255),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_semantic_seg'])
        ]),
    val=dict(
        type='CustomDataset',
        data_root='data/SIRLab_mars/',
        img_dir='img_dir/val',
        ann_dir='ann_dir/val',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(2048, 512),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]),
    test=dict(
        type='CustomDataset',
        data_root='data/SIRLab_mars/',
        img_dir='img_dir/val',
        ann_dir='ann_dir/val',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(2048, 512),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]))
log_config = dict(
    interval=50, hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
cudnn_benchmark = True
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()
lr_config = dict(policy='poly', power=0.9, min_lr=0.0001, by_epoch=False)
total_iters = 80000
checkpoint_config = dict(by_epoch=False, interval=8000)
evaluation = dict(interval=8000, metric='mIoU')
work_dir = './work_dirs/deeplabv3plus_r50-d8_512x1024_80k_cityscapes_SIR'
gpu_ids = range(0, 1)
```
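Before launching `tools/train.py`, it can be worth sanity-checking that the config resolves as intended. A small sketch using mmcv's config loader (the config path here is an assumption):

```python
# Sketch: load the config with mmcv and confirm key fields before training.
from mmcv import Config

cfg = Config.fromfile('deeplabv3plus_r50-d8_512x1024_80k_cityscapes_SIR.py')  # path assumed
print(cfg.model.decode_head.num_classes)                 # expect 6
print(cfg.data.train.data_root, cfg.data.train.ann_dir)  # expect the paths above
```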
However, after 80000 iterations the evaluation reports NaN for 5 of the 6 classes. The training log is pasted below:
```
2020-10-09 11:11:05,837 - mmseg - INFO - Loaded 1090 images
2020-10-09 11:11:06,461 - mmseg - INFO - Loaded 123 images
2020-10-09 11:11:06,462 - mmseg - INFO - Start running, work_dir: /mmsegmentation/work_dirs/deeplabv3plus_r50-d8_512x1024_80k_cityscapes_SIR
2020-10-09 11:11:06,462 - mmseg - INFO - workflow: [('train', 1)], max: 80000 iters
2020-10-09 11:11:48,158 - mmseg - INFO - Iter [50/80000] lr: 9.995e-03, eta: 14:02:00, time: 0.632, data_time: 0.005, memory: 20292, decode.loss_seg: 0.0674, decode.acc_seg: 89.3073, aux.loss_seg: 0.0679, aux.acc_seg: 87.8143, loss: 0.1354
2020-10-09 11:12:11,565 - mmseg - INFO - Iter [100/80000] lr: 9.989e-03, eta: 12:12:26, time: 0.468, data_time: 0.005, memory: 20292, decode.loss_seg: 0.0000, decode.acc_seg: 92.3593, aux.loss_seg: 0.0004, aux.acc_seg: 92.3593, loss: 0.0004
2020-10-09 11:12:47,120 - mmseg - INFO - Iter [150/80000] lr: 9.983e-03, eta: 13:23:25, time: 0.711, data_time: 0.005, memory: 20292, decode.loss_seg: 0.0000, decode.acc_seg: 90.7240, aux.loss_seg: 0.0003, aux.acc_seg: 90.7240, loss: 0.0003
...
2020-10-09 23:47:14,564 - mmseg - INFO - Iter [79700/80000] lr: 1.651e-04, eta: 0:02:49, time: 0.737, data_time: 0.006, memory: 20292, decode.loss_seg: 0.0005, decode.acc_seg: 89.8289, aux.loss_seg: 0.0005, aux.acc_seg: 89.8289, loss: 0.0010
2020-10-09 23:47:38,209 - mmseg - INFO - Iter [79750/80000] lr: 1.553e-04, eta: 0:02:21, time: 0.473, data_time: 0.006, memory: 20292, decode.loss_seg: 0.0006, decode.acc_seg: 91.7836, aux.loss_seg: 0.0005, aux.acc_seg: 91.7836, loss: 0.0011
2020-10-09 23:48:01,839 - mmseg - INFO - Iter [79800/80000] lr: 1.453e-04, eta: 0:01:53, time: 0.473, data_time: 0.006, memory: 20292, decode.loss_seg: 0.0006, decode.acc_seg: 91.9399, aux.loss_seg: 0.0005, aux.acc_seg: 91.9399, loss: 0.0012
2020-10-09 23:48:37,906 - mmseg - INFO - Iter [79850/80000] lr: 1.350e-04, eta: 0:01:24, time: 0.721, data_time: 0.006, memory: 20292, decode.loss_seg: 0.0006, decode.acc_seg: 92.1883, aux.loss_seg: 0.0005, aux.acc_seg: 92.1883, loss: 0.0012
2020-10-09 23:49:01,616 - mmseg - INFO - Iter [79900/80000] lr: 1.244e-04, eta: 0:00:56, time: 0.474, data_time: 0.006, memory: 20292, decode.loss_seg: 0.0006, decode.acc_seg: 91.7220, aux.loss_seg: 0.0005, aux.acc_seg: 91.7220, loss: 0.0011
2020-10-09 23:49:25,386 - mmseg - INFO - Iter [79950/80000] lr: 1.132e-04, eta: 0:00:28, time: 0.475, data_time: 0.006, memory: 20292, decode.loss_seg: 0.0006, decode.acc_seg: 92.6604, aux.loss_seg: 0.0006, aux.acc_seg: 92.6604, loss: 0.0012
2020-10-09 23:50:05,197 - mmseg - INFO - Saving checkpoint at 80000 iterations
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 124/123, 16.2 task/s, elapsed: 8s, ETA: 0s
2020-10-09 23:50:39,039 - mmseg - INFO - per class results:
Class    IoU     Acc
bedrock  100.00  100.00
stone    nan     nan
gravel   nan     nan
sand     nan     nan
soil     nan     nan
others   nan     nan
Summary:
Scope   mIoU    mAcc    aAcc
global  100.00  100.00  100.00
2020-10-09 23:50:39,095 - mmseg - INFO - Exp name: deeplabv3plus_r50-d8_512x1024_80k_cityscapes_SIR.py
2020-10-09 23:50:39,095 - mmseg - INFO - Iter(val) [80000] mIoU: 1.0000, mAcc: 1.0000, aAcc: 1.0000
```
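NaN IoU for a class usually means that class never occurs in the loaded ground truth (the metric divides 0 by 0), which points at the annotation pixel values rather than the model: near-zero loss from iteration 100 onward, together with 100% IoU for a single class, suggests every valid label pixel is being read as `bedrock`. A quick check along these lines makes that visible (my own sketch; the file name is hypothetical):

```python
# Sketch: list the unique pixel values in one annotation map. For 6 classes
# they should be exactly 0..5; 255 is reserved as the ignore index
# (see seg_pad_val=255 in the config above).
import numpy as np
from PIL import Image

ann = np.array(Image.open('data/SIRLab_mars/ann_dir/train/example.png'))
print(np.unique(ann))
```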
I changed the grayscale value of each category to 0, 1, 2…
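For reference, a remap along these lines would do it (a sketch; the source values in `GRAY2LABEL` are assumptions — substitute the real grayscale values used in the annotations):

```python
# Sketch: remap arbitrary grayscale values to the label indices 0..5.
import numpy as np
from PIL import Image

GRAY2LABEL = {0: 0, 50: 1, 100: 2, 150: 3, 200: 4, 250: 5}  # hypothetical source values

def remap(path_in, path_out):
    ann = np.array(Image.open(path_in))
    out = np.full_like(ann, 255)  # unmapped pixels fall back to the ignore index
    for gray, label in GRAY2LABEL.items():
        out[ann == gray] = label
    Image.fromarray(out).save(path_out)
```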
@ke-dev Why did I get an email about this question? I don't understand.