D2 wrapper: Strange behaviour with instance segmentation on custom dataset
Thanks for sharing the wonderful work. It’s amazing to see transformers unifying computer vision and NLP.
I am using the detectron2 wrapper.
I used the d2 wrapper to train a detection model (bbox only) on a custom dataset (2.7k training images, 500 for eval), which yields okay performance (mAP 30%) compared with Mask R-CNN.
After that I was looking to use the d2 wrapper to train an instance segmentation model. (Thanks to @jd730)
However, weird things happen every time I set MASK_ON: True.
I was trying to follow the recommendation to train DETR on bbox only first, then use the bbox model weights to fine-tune an instance segmentation model.
I expected to observe no big changes in bbox mAP at the early stage of training for instance segmentation. However, this is not the case: I got mAP = 0 for both instance segm and bbox at the beginning of fine-tuning.
I have set FROZEN_WEIGHTS to the bbox checkpoint, and based on my understanding of the code the bbox performance should not be affected, since the bbox forward pass is not touched by MASK_ON.
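For context, this is the workflow I have in mind, written as a plain-PyTorch sketch (not the wrapper’s actual code; the checkpoint layout and the "mask_head" prefix are only placeholders):

# Plain-PyTorch sketch of the intended workflow; "mask_head" is a placeholder
# for whatever the segmentation submodule is really called in the wrapper.
import torch
from torch import nn

def load_bbox_checkpoint_and_freeze(model: nn.Module, ckpt_path: str,
                                    mask_prefix: str = "mask_head") -> nn.Module:
    checkpoint = torch.load(ckpt_path, map_location="cpu")
    state_dict = checkpoint.get("model", checkpoint)
    # strict=False: the bbox-only checkpoint has no mask-head keys yet
    missing, unexpected = model.load_state_dict(state_dict, strict=False)
    print("missing keys (should be mask-head params only):", missing)
    print("unexpected keys (should be empty):", unexpected)
    # freeze everything except the segmentation head
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(mask_prefix)
    return model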
Instructions To Reproduce the Issue:
1. d2 config
Code changes? No, the model architecture is unchanged; the only changes are for loading the custom dataset.
Config file debug.yaml:
MODEL:
  META_ARCHITECTURE: "Detr"
  WEIGHTS: "checkpoint/exp_106_BN_detr_256_6_6/model_0051999.pth" # this weight comes from a detr model trained on the custom dataset
  PIXEL_MEAN: [123.675, 116.280, 103.530]
  PIXEL_STD: [58.395, 57.120, 57.375]
  MASK_ON: True
  RESNETS:
    NORM: FrozenBN
    DEPTH: 50
    STRIDE_IN_1X1: False
    OUT_FEATURES: ["res2", "res3", "res4", "res5"]
  DETR:
    GIOU_WEIGHT: 2.0
    L1_WEIGHT: 5.0
    NUM_OBJECT_QUERIES: 100
    NO_OBJECT_WEIGHT: 0.1
    FROZEN_WEIGHTS: 'checkpoint/exp_106_BN_detr_256_6_6/model_0051999.pth' # make sure the weights stay frozen except for the mask head
DATASETS:
  TRAIN: ("coco_2017_train",)
  TEST: ("coco_2017_val",)
SOLVER:
  CHECKPOINT_PERIOD: 2000 # 5000
  IMS_PER_BATCH: 1
  BASE_LR: 0.00005
  STEPS: (55440,)
  MAX_ITER: 92400
  WARMUP_FACTOR: 1.0
  WARMUP_ITERS: 10
  WEIGHT_DECAY: 0.0001
  OPTIMIZER: "ADAMW"
  BACKBONE_MULTIPLIER: 0.1
  CLIP_GRADIENTS:
    ENABLED: True
    CLIP_TYPE: "full_model"
    CLIP_VALUE: 0.01
    NORM_TYPE: 2.0
TEST:
  EVAL_PERIOD: 100
INPUT:
  MIN_SIZE_TRAIN: (480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800)
  CROP: # https://github.com/facebookresearch/detectron2/blob/master/detectron2/config/defaults.py
    ENABLED: False
    TYPE: "absolute_range"
    SIZE: (384, 600)
  FORMAT: "RGB"
DATALOADER:
  FILTER_EMPTY_ANNOTATIONS: False
  NUM_WORKERS: 0
VERSION: 2
This config file is for instance segmentation.
WEIGHTS is the checkpoint I got from training the bbox-only model on the custom dataset.
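Before training I also load the merged config to double-check the two flags that matter here (assuming add_detr_config is importable the same way d2/train_net.py imports it):

# Sketch: merge debug.yaml on top of the defaults and print the relevant flags.
from detectron2.config import get_cfg
from detr import add_detr_config  # import path assumed to match d2/train_net.py

cfg = get_cfg()
add_detr_config(cfg)
cfg.merge_from_file("debug.yaml")
print("MASK_ON:       ", cfg.MODEL.MASK_ON)
print("WEIGHTS:       ", cfg.MODEL.WEIGHTS)
print("FROZEN_WEIGHTS:", cfg.MODEL.DETR.FROZEN_WEIGHTS)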
2. command I run
python d2/train_net.py --config-file debug.yaml --eval-only
3. key observations
With MASK_ON: False
- mAP for bbox: 30%, which is reasonable for this dataset
With MASK_ON: True
- mAP for both bbox and segm: 0%
This is very strange, since at least the bbox mAP should not change much.
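To narrow this down, I am checking whether the bbox checkpoint is actually restored into the (possibly wrapped) model at all, using a small hypothetical helper like the one below; a key mismatch after wrapping (e.g. an extra module prefix) would explain the weights silently not being loaded and mAP collapsing to 0:

# Hypothetical debugging helper: compare checkpoint keys with the model's keys.
import torch
from torch import nn

def diff_checkpoint_keys(ckpt_path: str, model: nn.Module) -> None:
    state = torch.load(ckpt_path, map_location="cpu")
    state = state.get("model", state)
    model_keys = set(dict(model.named_parameters())) | set(dict(model.named_buffers()))
    ckpt_keys = set(state)
    print("in checkpoint but not in model:", sorted(ckpt_keys - model_keys)[:20])
    print("in model but not in checkpoint:", sorted(model_keys - ckpt_keys)[:20])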
Expected behavior:
With
- a pre-trained model (pretrained on bbox only, on the custom dataset) for init
- every other part of DETR frozen except the segmentation-relevant components
I expected to observe only very minor changes in bbox mAP when setting MASK_ON: True in the .yaml config file.
However, this is not the case.
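To confirm the freezing happens as I assume, I would also report which parameters are still trainable on the built model (a sketch; it assumes the model comes from detectron2's build_model(cfg) and that only the segmentation head should show up as trainable):

# Sketch: list trainable vs. frozen parameters on a model built from the config above.
from torch import nn

def report_trainable(model: nn.Module) -> None:
    trainable = [n for n, p in model.named_parameters() if p.requires_grad]
    frozen = [n for n, p in model.named_parameters() if not p.requires_grad]
    print(f"{len(trainable)} trainable / {len(frozen)} frozen parameters")
    print("sample of trainable params:", trainable[:10])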
Environment:
I don’t think this is related to any environment problem.
----------------------  -------------------------------------------------------------------------
sys.platform            linux
Python                  3.6.9 (default, Jul 17 2020, 12:50:27) [GCC 8.4.0]
numpy                   1.19.2
detectron2              0.2.1 @/home/appuser/detectron2_repo/detectron2
Compiler                GCC 7.5
CUDA compiler           CUDA 10.1
detectron2 arch flags   3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 7.0, 7.5
DETECTRON2_ENV_MODULE   <not set>
PyTorch                 1.6.0+cu101 @/home/appuser/.local/lib/python3.6/site-packages/torch
PyTorch debug build     False
GPU available           True
GPU 0                   Tesla T4 (arch=7.5)
CUDA_HOME               /usr/local/cuda
Pillow                  7.2.0
torchvision             0.7.0+cu101 @/home/appuser/.local/lib/python3.6/site-packages/torchvision
torchvision arch flags  3.5, 5.0, 6.0, 7.0, 7.5
fvcore                  0.1.2
cv2                     3.2.0
----------------------  -------------------------------------------------------------------------
PyTorch built with:
- GCC 7.3
- C++ Version: 201402
- Intel(R) Math Kernel Library Version 2019.0.5 Product Build 20190808 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v1.5.0 (Git Hash e2ac1fac44c5078ca927cb9b90e1b3066a0b2ed0)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- NNPACK is enabled
- CPU capability usage: AVX2
- CUDA Runtime 10.1
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75
- CuDNN 7.6.3
- Magma 2.5.2
- Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,
Top GitHub Comments
@ruodingt I think this is because DETR’s segmentation part is very heavy.
@alcinos
First, I trained a box-detection model based on DETR (no D2 wrapper). In this stage it worked well, with normal total_loss and loss_ce. Then I set that model as MODEL.DETR.FROZEN_WEIGHTS and used the config file above to train an instance segmentation model by running:
python train_net.py --config configs/detr_segm_256_6_6_torchvision.yaml --num-gpus 1
I’m sure that I have installed detectron2 correctly.
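(For reference, the quick check I use to confirm the install, relying only on standard attributes:)

# Print versions and CUDA availability to confirm the install.
import torch
import detectron2
print(detectron2.__version__, torch.__version__, torch.cuda.is_available())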