I need some help to reproduce the DeiT-III finetuning results
Hi
Thank you for sharing the finetune code & training logs. On IN-1k pretraining, I got results similar to your log: ViT-S 81.43 and ViT-B 82.88. However, I failed to reproduce the finetune performance even with your official finetuning setting, so I would like to ask for advice or help.
Here is my fine-tune result with ViT-B on IN-1k.

I expected the performance to increase as in your fine-tune log, but instead the finetune degrades it.
I can’t use submitit, so I used the following command on a single node with 8 A100 GPUs:
OMP_NUM_THREADS=1 python -m torch.distributed.launch --nproc_per_node=${num_gpus_per_node} --nnodes=${WORLD_SIZE} --node_rank=${RANK} --master_addr=${MASTER_ADDR} --master_port=${MASTER_PORT} --use_env main.py \
--model deit_base_patch16_LS \
--data-path ${local_data_path} \
--finetune ${SAVE_BASE_PATH}/pretraining/checkpoint-${epoch}.pth \
--output_dir ${SAVE_BASE_PATH}/finetune4 \
--batch-size 64 \
--print_freq 400 \
--epochs 20 \
--smoothing 0.1 \
--reprob 0.0 \
--opt adamw \
--lr 1e-5 \
--weight-decay 0.1 \
--input-size 224 \
--drop 0.0 \
--drop-path 0.2 \
--mixup 0.8 \
--cutmix 1.0 \
--unscale-lr \
--no-repeated-aug \
--aa rand-m9-mstd0.5-inc1
The full args printed on the command line were:
Namespace(ThreeAugment=False, aa='rand-m9-mstd0.5-inc1', attn_only=False, auto_resume=True, batch_size=64, bce_loss=False, clip_grad=None, color_jitter=0.3, cooldown_epochs=10, cutmix=1.0, cutmix_minmax=None, data_path='/mnt/ddn/datasets/ILSVRC2015/train/Data/CLS-LOC', data_set='IMNET', decay_epochs=30, decay_rate=0.1, device='cuda', dist_backend='nccl', dist_eval=False, dist_url='env://', distillation_alpha=0.5, distillation_tau=1.0, distillation_type='none', distributed=True, drop=0.0, drop_path=0.2, epochs=20, eval=False, finetune='/mnt/backbone-nfs/bhheo/checkpoints/deit_codebase_deit_base_patch16_LS_800epoch_reproduce/pretraining/checkpoint-800.pth', gpu=0, inat_category='name', input_size=224, log_dir='nsmlv2', log_name='finetune', lr=1e-05, lr_noise=None, lr_noise_pct=0.67, lr_noise_std=1.0, min_lr=1e-05, mixup=0.8, mixup_mode='batch', mixup_prob=1.0, mixup_switch_prob=0.5, model='deit_base_patch16_LS', model_ema=True, model_ema_decay=0.99996, model_ema_force_cpu=False, momentum=0.9, num_workers=10, opt='adamw', opt_betas=None, opt_eps=1e-08, output_dir='/mnt/backbone-nfs/bhheo/checkpoints/deit_codebase_deit_base_patch16_LS_800epoch_reproduce/finetune4', patience_epochs=10, pin_mem=True, print_freq=400, rank=0, recount=1, remode='pixel', repeated_aug=False, reprob=0.0, resplit=False, resume='', save_periods=['last2'], sched='cosine', seed=0, smoothing=0.1, src=False, start_epoch=0, teacher_model='regnety_160', teacher_path='', train_interpolation='bicubic', unscale_lr=True, warmup_epochs=5, warmup_lr=1e-06, weight_decay=0.1, world_size=8)
I think it is the same as your finetune setting. I double-checked my code but I still don’t know why the result is totally different.
I’m using different library versions: torch 1.11.0a0+b6df043, torchvision 0.11.0a0, timm 0.5.4.
This might cause some problems, but pretraining worked fine, and the performance gap seems too severe to be a simple library-version issue.
I’m sorry to keep bothering you, but could you please let me know if there is something wrong with my setting? Or could you please share the ViT-B weights pretrained on IN-1k at 192x192 resolution, before the 224x224 finetuning? With the pre-finetune weights, I could verify my finetune code without doubting my pretraining.
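For reference, the main transformation I expect --finetune to perform when going from a 192x192 checkpoint to 224x224 training is a bicubic resize of the positional embeddings. Below is a minimal sketch of that idea; the function name and the num_extra_tokens default are my own, and this is an illustration rather than the repo's exact code.

import torch
import torch.nn.functional as F

def interpolate_pos_embed(pos_embed, new_grid_size, num_extra_tokens=1):
    # pos_embed: (1, num_extra_tokens + old_grid**2, dim); the extra tokens
    # (e.g. the class token) are kept as-is and only the patch grid is resized.
    extra = pos_embed[:, :num_extra_tokens]
    patches = pos_embed[:, num_extra_tokens:]
    dim = patches.shape[-1]
    old_grid = int(patches.shape[1] ** 0.5)
    patches = patches.reshape(1, old_grid, old_grid, dim).permute(0, 3, 1, 2)
    patches = F.interpolate(patches, size=(new_grid_size, new_grid_size),
                            mode='bicubic', align_corners=False)
    patches = patches.permute(0, 2, 3, 1).reshape(1, new_grid_size ** 2, dim)
    return torch.cat([extra, patches], dim=1)

# 192/16 = 12x12 patch grid (pretraining) -> 224/16 = 14x14 patch grid (finetuning):
# checkpoint['model']['pos_embed'] = interpolate_pos_embed(checkpoint['model']['pos_embed'], 14)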
Hi
I got the result, and it is almost the same as the official log.
Setting set_training_mode=True solves the fine-tune problem. Thank you for your advice @TouvronHugo
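For anyone hitting the same problem: as far as I understand, this flag is simply forwarded to model.train() inside the training loop. The sketch below is a simplified illustration of that pattern (not the exact deit engine.py code), showing why set_training_mode=False silently fine-tunes with eval-mode behavior.

import torch

def train_one_epoch(model: torch.nn.Module, data_loader, optimizer, criterion,
                    device, set_training_mode: bool = True):
    # model.train() accepts a bool: passing False keeps the network in eval mode,
    # so dropout / stochastic depth stay disabled for the whole fine-tune epoch.
    model.train(set_training_mode)
    for samples, targets in data_loader:
        samples, targets = samples.to(device), targets.to(device)
        loss = criterion(model(samples), targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()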
Best Heo
@TouvronHugo Oh, that looks critical. I will test it ASAP.
Best Heo