Loss cannot drop
Thank you so much for sharing your code. I am trying to use ViT as the encoder, followed by a common decoder, to build a segmentation network. I trained it from scratch but found that the loss does not drop from the very beginning of training, and the results stay near 0. Is there any trick for training ViT correctly? Is it essential to load a pre-trained model and fine-tune it?
Here is my configuration:
patch_size=16
hidden_size=16*16*3
mlp_dim=3072
dropout_rate=0.1
num_heads=12
num_layers=12
lr=3e-4
opt=Adam
weight_decay=0.0
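For readers who want to check these numbers, here is a minimal sketch of the shapes the configuration implies. The input resolution is not given in the post, so `image_size = 224` is purely an assumption, as are the RGB channel count and the helper name `vit_shapes`. The per-head width assumes the common convention that the heads split the hidden size evenly; some implementations set it independently.

```python
# Values taken from the configuration in the post.
patch_size = 16
hidden_size = 16 * 16 * 3
num_heads = 12

# Assumptions, not stated in the post.
channels = 3        # RGB input
image_size = 224    # hypothetical square input resolution


def vit_shapes(image_size, patch_size, channels, hidden_size, num_heads):
    """Derive basic ViT shapes (hypothetical helper, for illustration only)."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    if hidden_size % num_heads != 0:
        raise ValueError("hidden_size must be divisible by num_heads")
    grid = image_size // patch_size  # patches along each side
    return {
        "patch_dim": patch_size * patch_size * channels,  # flattened patch length
        "grid": grid,
        "num_patches": grid * grid,  # sequence length without a class token
        "head_dim": hidden_size // num_heads,  # assumes heads split hidden_size evenly
    }


shapes = vit_shapes(image_size, patch_size, channels, hidden_size, num_heads)
print(shapes)
```

With these assumptions the hidden size equals the flattened patch length, so the patch projection keeps the dimensionality unchanged, and a segmentation decoder would receive a `grid` by `grid` map of tokens.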
Issue Analytics
- State:
- Created 3 years ago
- Reactions:1
- Comments:7 (3 by maintainers)
Top GitHub Comments
@lucidrains OK. Thanks a lot for your kind suggestions 😃
Thanks a lot for your help. I will try removing the conv-based decoder from the segmentation pipeline. I also have another question: can the original learnable position embedding (with random initialization) learn spatial information well in spatially sensitive tasks such as semantic segmentation?