D2 wrapper: Strange behaviour with instance segmentation on custom dataset


Thanks for sharing the wonderful work. It’s amazing to see the transformer is unifying computer vision and NLP.

I am using detectron2 wrapper.

I used the d2 wrapper to train a detection model (bbox only) on a custom dataset (2.7k training images, 500 for eval), which yields okay performance (mAP 30%) compared with Mask R-CNN.

After that, I wanted to use the d2 wrapper to train an instance segmentation model. (Thanks to @jd730)

However, weird things happen every time I set MASK_ON: True.

I was trying to follow the recommendation to train DETR on bbox first, then use the weights of the bbox model to fine-tune an instance segmentation model.

I expected to see no big change in bbox mAP at the early stage of instance segmentation training. However, this is not the case.

The mAP for both instance segm and bbox is 0 at the beginning of fine-tuning.

I have set FROZEN_WEIGHTS to a specific value, and based on my understanding of the code, the bbox performance should not be affected, since the bbox forward pass is not changed by MASK_ON:

https://github.com/facebookresearch/detr/blob/4e1a9281bc5621dcd65f3438631de25e255c4269/models/segmentation.py#L51
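One way to test the assumption that the bbox weights really end up in the mask-enabled model is to compare parameter names on both sides. The sketch below is only an illustration: it uses plain lists standing in for the checkpoint keys and the model's `state_dict()` keys, and every key name in it (including the extra `detr.` prefix) is hypothetical, not taken from the actual checkpoint.

```python
def strip_prefix(key, prefix):
    """Return key without a leading prefix, or unchanged if the prefix is absent."""
    return key[len(prefix):] if key.startswith(prefix) else key


def compare_keys(checkpoint_keys, model_keys, extra_prefix=""):
    """Report which model parameters a checkpoint would fill.

    extra_prefix is stripped from each model key before comparing. This
    models the case where wrapping a network adds one more level to
    every parameter name.
    """
    ckpt = set(checkpoint_keys)
    normalized = {strip_prefix(k, extra_prefix): k for k in model_keys}
    matched = sorted(v for k, v in normalized.items() if k in ckpt)
    missing = sorted(v for k, v in normalized.items() if k not in ckpt)
    unexpected = sorted(ckpt - set(normalized))
    return matched, missing, unexpected


# Hypothetical names, for illustration only.
checkpoint_keys = ["detr.class_embed.weight", "detr.bbox_embed.weight"]
model_keys = [
    "detr.detr.class_embed.weight",
    "detr.detr.bbox_embed.weight",
    "detr.mask_head.weight",
]

# Compare names as-is, then again after removing one wrapper level.
naive = compare_keys(checkpoint_keys, model_keys)
remapped = compare_keys(checkpoint_keys, model_keys, extra_prefix="detr.")

print("naive    matched/missing/unexpected:", [len(x) for x in naive])
print("remapped matched/missing/unexpected:", [len(x) for x in remapped])
```

If a comparison like this on the real keys shows no matches for the detection parameters, those layers would keep their random initialization, which would be one possible explanation for a bbox mAP of 0. Whether that is what happens here is not confirmed by this thread.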

Instructions To Reproduce the Issue:

1. d2 config

Code changes?

The model architecture is unchanged.
The only changes are for loading the custom dataset.

Config debug.yaml file

MODEL:
  META_ARCHITECTURE: "Detr"
  WEIGHTS: "checkpoint/exp_106_BN_detr_256_6_6/model_0051999.pth" # these weights come from a DETR model trained on a custom dataset
  PIXEL_MEAN: [123.675, 116.280, 103.530]
  PIXEL_STD: [58.395, 57.120, 57.375]
  MASK_ON: True
  RESNETS:
    NORM: FrozenBN
    DEPTH: 50
    STRIDE_IN_1X1: False
    OUT_FEATURES: ["res2", "res3", "res4", "res5"]
  DETR:
    GIOU_WEIGHT: 2.0
    L1_WEIGHT: 5.0
    NUM_OBJECT_QUERIES: 100
    NO_OBJECT_WEIGHT: 0.1
    FROZEN_WEIGHTS: 'checkpoint/exp_106_BN_detr_256_6_6/model_0051999.pth' # make sure all weights are frozen except for the mask head
DATASETS:
  TRAIN: ("coco_2017_train",)
  TEST: ("coco_2017_val",)
SOLVER:
  CHECKPOINT_PERIOD: 2000 #5000
  IMS_PER_BATCH: 1
  BASE_LR: 0.00005
  STEPS: (55440,)
  MAX_ITER: 92400
  WARMUP_FACTOR: 1.0
  WARMUP_ITERS: 10
  WEIGHT_DECAY: 0.0001
  OPTIMIZER: "ADAMW"
  BACKBONE_MULTIPLIER: 0.1
  CLIP_GRADIENTS:
    ENABLED: True
    CLIP_TYPE: "full_model"
    CLIP_VALUE: 0.01
    NORM_TYPE: 2.0
TEST:
  EVAL_PERIOD: 100
INPUT:
  MIN_SIZE_TRAIN: (480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800)
  CROP: # https://github.com/facebookresearch/detectron2/blob/master/detectron2/config/defaults.py
    ENABLED: False
    TYPE: "absolute_range"
    SIZE: (384, 600)
  FORMAT: "RGB"
DATALOADER:
  FILTER_EMPTY_ANNOTATIONS: False
  NUM_WORKERS: 0
VERSION: 2

This config file is for instance segmentation. WEIGHTS is the checkpoint I got from training the model on the custom dataset.
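For reference, the SOLVER values above imply a simple step schedule. The sketch below reproduces it with the standard library only. It assumes a decay factor of 0.1, which is not set in the config (detectron2's default `SOLVER.GAMMA` is assumed), and it leaves out warmup because `WARMUP_FACTOR: 1.0` should leave the rate unscaled. The backbone multiplier would only matter for parameters that are still trainable.

```python
from bisect import bisect_right

# Values copied from the SOLVER section of the config above.
BASE_LR = 0.00005
STEPS = (55440,)
BACKBONE_MULTIPLIER = 0.1

# Assumption: decay factor not present in the config.
GAMMA = 0.1


def learning_rate(iteration, backbone=False):
    """Learning rate at a given iteration under a multi-step schedule."""
    # Count how many milestones have been reached at this iteration.
    decays = bisect_right(STEPS, iteration)
    lr = BASE_LR * GAMMA ** decays
    return lr * BACKBONE_MULTIPLIER if backbone else lr


for it in (0, 55439, 55440, 92399):
    print(it, learning_rate(it), learning_rate(it, backbone=True))
```

None of this affects the `--eval-only` run used to reproduce the problem, since no optimizer step is taken there.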

2. command I run

python d2/train_net.py --config-file debug.yaml --eval-only

3. key observations

With MASK_ON: False

mAP for bbox: 30%,
which is reasonable for this dataset.

With MASK_ON: True

mAP for both bbox and segm: 0%

This is very strange, since at least the bbox mAP should NOT change much.

Expected behavior:

With

  1. a pre-trained model (pre-trained on bbox only, custom dataset) for initialization
  2. every part of DETR frozen except for the segmentation-related components

I expected to observe only very minor changes in bbox mAP when setting MASK_ON: True in the .yaml config file. However, this is not the case.
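The expectation above can be turned into a concrete check: dump the raw box predictions for the same image under both settings and compare them coordinate by coordinate. The helper below is a minimal sketch of that comparison. The function names and the sample boxes are made up for illustration, and it assumes both runs return boxes in the same order (for example, one box per object query).

```python
def max_box_difference(boxes_a, boxes_b):
    """Largest absolute coordinate difference between two box lists.

    Each list holds (x0, y0, x1, y1) tuples, in the same order.
    """
    if len(boxes_a) != len(boxes_b):
        raise ValueError("box lists must have the same length")
    diffs = [
        abs(a - b)
        for box_a, box_b in zip(boxes_a, boxes_b)
        for a, b in zip(box_a, box_b)
    ]
    return max(diffs, default=0.0)


def boxes_agree(boxes_a, boxes_b, tolerance=1e-3):
    """True if every coordinate differs by at most the tolerance."""
    return max_box_difference(boxes_a, boxes_b) <= tolerance


# Made-up predictions, for illustration only.
mask_off = [(10.0, 20.0, 110.0, 220.0), (5.0, 5.0, 50.0, 60.0)]
mask_on_same = [(10.0, 20.0, 110.0, 220.0), (5.0, 5.0, 50.0, 60.0)]
mask_on_different = [(0.0, 0.0, 1.0, 1.0), (300.0, 10.0, 320.0, 40.0)]

print(boxes_agree(mask_off, mask_on_same))
print(boxes_agree(mask_off, mask_on_different))
```

If the real predictions disagree at this level, the difference is already present in the detection branch itself rather than introduced during evaluation of the masks.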

Environment:

I don’t think this is related to the environment.

----------------------  -------------------------------------------------------------------------
sys.platform            linux
Python                  3.6.9 (default, Jul 17 2020, 12:50:27) [GCC 8.4.0]
numpy                   1.19.2
detectron2              0.2.1 @/home/appuser/detectron2_repo/detectron2
Compiler                GCC 7.5
CUDA compiler           CUDA 10.1
detectron2 arch flags   3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 7.0, 7.5
DETECTRON2_ENV_MODULE   <not set>
PyTorch                 1.6.0+cu101 @/home/appuser/.local/lib/python3.6/site-packages/torch
PyTorch debug build     False
GPU available           True
GPU 0                   Tesla T4 (arch=7.5)
CUDA_HOME               /usr/local/cuda
Pillow                  7.2.0
torchvision             0.7.0+cu101 @/home/appuser/.local/lib/python3.6/site-packages/torchvision
torchvision arch flags  3.5, 5.0, 6.0, 7.0, 7.5
fvcore                  0.1.2
cv2                     3.2.0
----------------------  -------------------------------------------------------------------------
PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2019.0.5 Product Build 20190808 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v1.5.0 (Git Hash e2ac1fac44c5078ca927cb9b90e1b3066a0b2ed0)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 10.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75
  - CuDNN 7.6.3
  - Magma 2.5.2
  - Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF, 

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

jd730 commented, Oct 14, 2020 (1 reaction)

@ruodingt I think this is because DETR’s segmentation part is very heavy.

Real-YeJ commented, Oct 22, 2020 (0 reactions)

@alcinos
First, I trained a box-detection model based on DETR (no D2 wrapper). In this stage, it worked well, with normal total_loss and loss_ce. Then I set that model as MODEL.DETR.FROZEN_WEIGHTS and used the config file above to train an instance segmentation model by running:

python train_net.py --config configs/detr_segm_256_6_6_torchvision.yaml --num-gpus 1

I’m sure that I have installed detectron2 correctly.
