Error finetuning from pretrained checkpoint
Hi all, I’m running into an error when trying to fine-tune from one of the pretrained checkpoints.
Code
!mkdir "$output"
!wget -q -O "$output/checkpoint.pth" https://dl.fbaipublicfiles.com/dino/dino_deitsmall16_pretrain/dino_deitsmall16_pretrain.pth
!python -m torch.distributed.launch \
--nproc_per_node=1 ./dino/main_dino.py \
--arch deit_small \
--data_path "$input" \
--output_dir "$output"
Error
| distributed init (rank 0): env://
git:
sha: 8aa93fdc90eae4b183c4e3c005174a9f634ecfbf, status: clean, branch: main
arch: deit_small
batch_size_per_gpu: 64
...
...
Student and Teacher are built: they are both deit_small network.
Loss, optimizer and schedulers ready.
Found checkpoint at ./drive/MyDrive/DINO/checkpoint.pth
=> failed to load student from checkpoint './drive/MyDrive/DINO/checkpoint.pth'
=> failed to load teacher from checkpoint './drive/MyDrive/DINO/checkpoint.pth'
=> failed to load optimizer from checkpoint './drive/MyDrive/DINO/checkpoint.pth'
=> failed to load fp16_scaler from checkpoint './drive/MyDrive/DINO/checkpoint.pth'
=> failed to load dino_loss from checkpoint './drive/MyDrive/DINO/checkpoint.pth'
Any suggestions would be very much appreciated.
Top GitHub Comments
Hi @ymathildecaron31
Thank you so much for your wonderful work and all the time you’re putting into helping others build on it.
It looks like the checkpoints were trained on a slightly different version of the released code. Luckily it’s not difficult to change the names of the affected keys.
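For anyone hitting the same thing, here is a minimal sketch of that kind of key renaming. It is not the exact change used here: the affected key names are not listed in this thread, so the prefixes below are hypothetical placeholders, and plain numbers stand in for tensors so the snippet runs without PyTorch.

```python
from collections import OrderedDict


def rename_keys(state_dict, renames):
    """Return a copy of state_dict with key prefixes rewritten.

    renames maps an old prefix to a new one. The first matching prefix wins,
    and keys that match no prefix are copied unchanged.
    """
    renamed = OrderedDict()
    for key, value in state_dict.items():
        new_key = key
        for old_prefix, new_prefix in renames.items():
            if key.startswith(old_prefix):
                new_key = new_prefix + key[len(old_prefix):]
                break
        if new_key in renamed:
            raise KeyError(f"renaming {key!r} collides with existing key {new_key!r}")
        renamed[new_key] = value
    return renamed


# Hypothetical example: the values stand in for tensors, and both prefixes
# are placeholders, not the real DINO key names.
old_state = OrderedDict([
    ("old_block.0.weight", 1.0),
    ("old_block.0.bias", 2.0),
    ("head.weight", 3.0),
])
new_state = rename_keys(old_state, {"old_block.": "new_block."})
print(list(new_state))
```

With a real checkpoint, the same function could be applied to the dictionary returned by torch.load(path, map_location="cpu") and the result written back with torch.save, assuming the file holds an ordinary state dict. Printing the checkpoint's keys next to the model's state_dict() keys first is a quick way to find which names actually differ.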
Now training starts at a much smaller loss and I see the following message.