Loss cannot drop

Thank you so much for sharing your code. I am trying to use ViT as the encoder, followed by a common decoder, to build a segmentation network. I am training it from scratch, but the loss has not dropped since the beginning of training, and the results stay near 0. Is there any trick to training ViT correctly? Is it very important to load a pre-trained model and fine-tune it? Here is my configuration:

patch_size = 16
hidden_size = 16*16*3
mlp_dim = 3072
dropout_rate = 0.1
num_heads = 12
num_layers = 12
lr = 3e-4
opt = Adam
weight_decay = 0.0
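For readers checking a similar setup, here is a minimal sketch in plain Python that sanity-checks the shapes implied by the configuration above. The image size of 224 and the 3 input channels are assumptions (the post does not state them), and `vit_shapes` is a hypothetical helper, not part of any library. It does not diagnose the loss problem; it only confirms that the numbers are mutually consistent.

```python
# Values taken from the configuration in the post.
patch_size = 16
num_heads = 12
mlp_dim = 3072

# Assumptions: neither value is stated in the post.
channels = 3        # RGB input
image_size = 224    # a common ViT input resolution

# hidden_size = 16*16*3, i.e. one flattened RGB patch.
hidden_size = patch_size * patch_size * channels


def vit_shapes(image_size, patch_size, hidden_size, num_heads):
    """Hypothetical helper: return (num_patches, head_dim) or raise on a bad config."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    if hidden_size % num_heads != 0:
        raise ValueError("hidden_size must be divisible by num_heads")
    grid = image_size // patch_size          # patches per side
    return grid * grid, hidden_size // num_heads


num_patches, head_dim = vit_shapes(image_size, patch_size, hidden_size, num_heads)
mlp_ratio = mlp_dim // hidden_size

print("hidden_size:", hidden_size)
print("num_patches:", num_patches)
print("head_dim:", head_dim)
print("mlp_ratio:", mlp_ratio)
```

Under these assumptions the hidden size is 768 with 64 dimensions per head and an MLP ratio of 4, which matches the standard ViT-Base/16 layout, so the configuration itself is internally consistent.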

Issue Analytics

State:
Created 3 years ago
Reactions: 1
Comments: 7 (3 by maintainers)

Top GitHub Comments

1 reaction

QiushiYang commented, Jan 17, 2021

@lucidrains OK. Thanks a lot for your kind suggestions 😃

1 reaction

QiushiYang commented, Jan 15, 2021

Thanks a lot for your help. I will try removing the conv-based decoder from the segmentation pipeline. I also have another question: can the original learnable position embedding (with random initialization) learn spatial information well in spatially sensitive tasks such as semantic segmentation?
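To make the question concrete, here is a toy sketch in plain Python of what a learnable position embedding with random initialization amounts to: one vector per patch position, added to that patch's token. The sizes and the `add_positions` helper are hypothetical and chosen only for illustration. The sketch does not show whether training recovers spatial structure; it only shows the starting point.

```python
import random

random.seed(0)

# Toy sizes, chosen for illustration only.
num_patches, dim = 4, 8

# One vector per patch position, randomly initialised. In a real model these
# values would be trainable parameters updated by the optimizer.
pos_embedding = [
    [random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_patches)
]


def add_positions(tokens, pos):
    """Hypothetical helper: add the position vector to each patch token."""
    return [[t + p for t, p in zip(tok, pe)] for tok, pe in zip(tokens, pos)]


# Four patches with identical content.
tokens = [[1.0] * dim for _ in range(num_patches)]
out = add_positions(tokens, pos_embedding)

# The patches are now distinguishable, but only through the random vectors.
print(out[0] != out[1])
```

At initialization the vectors carry no notion of which positions are neighbours, so any spatial layout has to be learned from the data during training.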