Unable to reproduce the results on Charades by using 4 GPUs

Hi,

I am trying to replicate the ResNet-50 baseline experiment on the Charades dataset. I'm using the following config:

DATASET: charades
DATADIR: /ssd_scratch/cvit/avijit/datasets/charades/Charades_v1_rgb

NUM_GPUS: 4
LOG_PERIOD: 10

MODEL:
  NUM_CLASSES: 157
  MODEL_NAME: resnet_video
  BN_MOMENTUM: 0.9
  BN_EPSILON: 1.0000001e-5
  ALLOW_INPLACE_SUM: True
  ALLOW_INPLACE_RELU: True
  ALLOW_INPLACE_RESHAPE: True
  MEMONGER: True

  BN_INIT_GAMMA: 0.0
  DEPTH: 50
  VIDEO_ARC_CHOICE: 2

  MULTI_LABEL: True
  USE_AFFINE: True

RESNETS:
  NUM_GROUPS: 1  # ResNet: 1x; RESNETS: 32x
  WIDTH_PER_GROUP: 64  # ResNet: 64d; RESNETS: 4d
  TRANS_FUNC: bottleneck_transformation_3d # bottleneck_transformation, basic_transformation

TRAIN:
  DATA_TYPE: train
  BATCH_SIZE:  8 #16
  EVAL_PERIOD: 4000
  JITTER_SCALES: [256, 320]

  COMPUTE_PRECISE_BN: False
  CROP_SIZE: 224

  VIDEO_LENGTH: 32
  SAMPLE_RATE: 4
  DROPOUT_RATE: 0.3
  PARAMS_FILE: pretrained_weights/r50_k400_pretrained.pkl
  DATASET_SIZE: 7811
  RESET_START_ITER: True

TEST:
  DATA_TYPE: val
  BATCH_SIZE: 4 #16
  CROP_SIZE: 256
  SCALE: 256

  VIDEO_LENGTH: 32
  SAMPLE_RATE: 4

  DATASET_SIZE: 1814

SOLVER:
  LR_POLICY: 'steps_with_relative_lrs' # 'step', 'steps_with_lrs', 'steps_with_relative_lrs', 'steps_with_decay'
  BASE_LR: 0.01
  #STEP_SIZES: [20000, 4000]
  STEP_SIZES: [20000, 4000, 20000, 4000]
  LRS: [1, 0.1, 0.1, 0.1]
  MAX_ITER: 48000

  WEIGHT_DECAY: 0.0000125
  WEIGHT_DECAY_BN: 0.0
  MOMENTUM: 0.9
  NESTEROV: True
  SCALE_MOMENTUM: True

CHECKPOINT:
  DIR: '.'
  CHECKPOINT_PERIOD: 4000
  CONVERT_MODEL: True

NONLOCAL:
  USE_ZERO_INIT_CONV: True
  USE_BN: False
  USE_AFFINE: True
  CONV3_NONLOCAL: True
  CONV4_NONLOCAL: True
  USE_SCALE: True
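
For reference, here is a minimal sketch of how the SOLVER block above might resolve to a per-iteration learning rate. It assumes that STEP_SIZES are consecutive phase lengths and that each LRS entry is a multiplier on BASE_LR, as the policy name 'steps_with_relative_lrs' suggests. The helper `lr_at` is hypothetical and is not the repository's own code.

```python
from bisect import bisect_right
from itertools import accumulate

# Values copied from the SOLVER block of the config above.
BASE_LR = 0.01
STEP_SIZES = [20000, 4000, 20000, 4000]
LRS = [1, 0.1, 0.1, 0.1]


def lr_at(cur_iter, base_lr=BASE_LR, step_sizes=STEP_SIZES, lrs=LRS):
    """Return the learning rate at cur_iter (hypothetical helper).

    Assumption: phase i lasts step_sizes[i] iterations and uses
    base_lr * lrs[i].
    """
    # Cumulative phase ends: [20000, 24000, 44000, 48000]
    boundaries = list(accumulate(step_sizes))
    # Index of the phase that contains cur_iter, clamped to the last phase.
    idx = min(bisect_right(boundaries, cur_iter), len(lrs) - 1)
    return base_lr * lrs[idx]


if __name__ == "__main__":
    for it in (0, 19999, 20000, 24000, 47999):
        print(it, lr_at(it))
```

Under these assumptions, the multiplier is 1 only for the first 20000 iterations and 0.1 for the remaining 28000.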

As you can see, I am using 4 GPUs, so I have halved both the batch size and the learning rate. However, the highest mAP I get is ~36.0, whereas testing with your pre-trained model gives ~38 mAP. Could you please check my config file and suggest any necessary changes?
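
The halving described above follows the common linear scaling heuristic: scale the learning rate with the batch size and stretch the schedule so the number of epochs stays the same. The sketch below is illustrative only. The reference values are inferred from the post (batch size 16 from the `#16` comment, learning rate 0.02 from "reduced by half", step sizes from the commented-out `STEP_SIZES: [20000, 4000]`), and the helper names are hypothetical. The post does not confirm that these match the repository's recommended settings.

```python
def scale_for_batch(ref_batch, ref_lr, ref_step_sizes, new_batch):
    """Linear scaling heuristic (hypothetical helper).

    The learning rate scales with the batch size, and each schedule
    phase is stretched by the inverse factor so the epoch count is kept.
    """
    k = new_batch / ref_batch
    new_lr = ref_lr * k
    new_step_sizes = [round(s / k) for s in ref_step_sizes]
    return new_lr, new_step_sizes


def epochs(total_iters, batch_size, dataset_size):
    """Number of passes over the dataset for a given schedule."""
    return total_iters * batch_size / dataset_size


if __name__ == "__main__":
    # Reference values are assumptions inferred from the post, see above.
    lr, steps = scale_for_batch(16, 0.02, [20000, 4000], new_batch=8)
    print(lr, steps)
    # TRAIN.DATASET_SIZE from the config is 7811.
    print(epochs(sum(steps), 8, 7811))
```

With these inputs the scaled schedule has the same total length as the MAX_ITER of 48000 in the config, so the total number of epochs is unchanged. How those iterations are split between learning-rate phases is a separate question.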

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 7

Top GitHub Comments

avijit9 commented, Aug 19, 2019 (1 reaction)

Thanks a lot 😃

avijit9 commented, Aug 19, 2019 (1 reaction)

It worked like a charm! Thanks again for your help.
