Training logs for Swin-B/S/L
Hi authors,
Following the official training command below, I observed unstable training loss and accuracy around epoch 20.
```
python -m torch.distributed.launch --nproc_per_node 8 --master_port 12345 main.py \
  --cfg configs/swin_base_patch4_window7_224.yaml --data-path <imagenet-path> --batch-size 64 \
  --accumulation-steps 2 [--use-checkpoint]
```
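For context, the effective batch size this command implies, and the linearly scaled LR it leads to, can be worked out as below. This is a sketch: the base LR of 5e-4 and the batch-512 linear scaling rule are assumptions about the repo's defaults, not something stated in this thread.

```python
# Illustrative arithmetic only; base_lr and base_batch are assumed defaults.
gpus = 8            # --nproc_per_node 8
batch_per_gpu = 64  # --batch-size 64
accum_steps = 2     # --accumulation-steps 2
base_lr = 5e-4      # assumed default base LR
base_batch = 512    # assumed reference batch size for linear scaling

effective_batch = gpus * batch_per_gpu * accum_steps  # 8 * 64 * 2 = 1024
scaled_lr = base_lr * effective_batch / base_batch    # 1e-3, i.e. "doubled"
print(effective_batch, scaled_lr)
```

Under these assumptions the effective batch is 1024, so the scaled LR is twice the base LR, which matches the "doubled LR" observation later in the thread.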
Can you please share the training logs for Swin-B? And if more logs are available, please consider sharing them as well.
TIA!
Issue Analytics
- Created: 2 years ago
- Comments: 11 (4 by maintainers)
Top GitHub Comments
Hi @lcmeng, the model is trained using the default mixed precision (O1). We don't handle amp's logging, so the loss-scaling info is not written to the log file.
Your environment seems correct, and I am not sure what is causing your problem. According to other users' feedback, installing apex from source can be error-prone, so I suspect there might be something wrong with your apex installation. The nvcr-21.05 Docker image ships with apex pre-installed, so you can try that first. The CUDA and PyTorch versions are fine; I have checked them.
Also, please share your log with me so I can look into it further.
@zeliu98, thank you for the explanation. I've added some TensorBoard code to Swin to visualize the training. The drop in accuracy near the peak LR appears to be correlated with an explosion of the gradient norm.
Please see the attached screenshots. The LR appears doubled because accumulation-steps = 2; it is in fact equivalent to the recommended setup.
(1) The trace of the gradient norm over global steps: it increases very aggressively after the initial "flat" phase.
(2) The max top-1 accuracy occurred at epoch 15.
(3) The recommended LR schedule was used.
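For readers who want to reproduce this kind of plot, a minimal sketch of logging a global gradient norm follows. This is not the code actually added to Swin in the comment above; `writer` (a `torch.utils.tensorboard.SummaryWriter`) and `global_step` are assumed names from a surrounding training loop.

```python
import torch

def global_grad_norm(model: torch.nn.Module) -> float:
    """L2 norm over all parameter gradients; call after loss.backward()."""
    total_sq = 0.0
    for p in model.parameters():
        if p.grad is not None:
            total_sq += p.grad.detach().float().norm(2).item() ** 2
    return total_sq ** 0.5

# Inside the training loop (illustrative):
#   loss.backward()
#   writer.add_scalar("train/grad_norm", global_grad_norm(model), global_step)
#   optimizer.step()
```

Note that with amp (O1), gradients may be scaled by the loss scaler at the point of measurement, so the logged norm should be interpreted accordingly (or measured after unscaling).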