
How to train the DeiT student model on my dataset?

See original GitHub issue

Thanks for your code, it's really good! But I'm confused about the distillation token when using the code to train on my data. Can you help me?

According to the GitHub tutorial, I train DeiT with command 1:

python main.py --model deit_base_distilled_patch16_224 --batch-size 256 --data-path /path/to/imagenet --output_dir /path/to/save

This trains DeiT from scratch, without fine-tuning. Then I tried command 2:

python main.py --model deit_base_patch16_384 --batch-size 32 --finetune https://dl.fbaipublicfiles.com/deit/deit_base_patch16_224-b5f2ef4d.pth --input-size 384 --lr 5e-6 --weight-decay 1e-8 --epochs 30 --min-lr 5e-6

Although I got a good model, I still want to reproduce the student model on my data. Commands 1 and 2 both give me teacher-style models on my data, like the first row in the picture below (maybe?).

[image]

But how can I get 84.0% or even 84.5% accuracy?

PS: I tried using the model from command 2 as the teacher, then trained a student model with command 3:

python main.py --model deit_base_patch16_384 --batch-size 32 --distillation-type hard --teacher-model deit_base_distilled_patch16_224 --teacher-path /path/to/save/checkpoint.pth

but sadly I got low accuracy.
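For context, the hard distillation described in the DeiT paper supervises the class token with the ground-truth labels and the distillation token with the teacher's hard predictions. Below is a minimal sketch in PyTorch-style Python of that objective; it is illustrative only, not the repo's actual implementation, and all names are made up for the example.

    import torch.nn.functional as F

    def hard_distillation_loss(cls_logits, dist_logits, teacher_logits, labels):
        # cls_logits, dist_logits, teacher_logits: (batch, num_classes); labels: (batch,)
        # Class-token head: standard cross-entropy against the ground-truth labels.
        loss_cls = F.cross_entropy(cls_logits, labels)
        # Distillation-token head: cross-entropy against the teacher's hard
        # predictions, i.e. the argmax of the teacher's logits.
        teacher_labels = teacher_logits.argmax(dim=1)
        loss_dist = F.cross_entropy(dist_logits, teacher_labels)
        # The paper weights the two terms equally.
        return 0.5 * loss_cls + 0.5 * loss_dist

At inference time, the paper averages the softmax outputs of the two heads rather than using either token alone.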

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

2 reactions
jorie-peng commented, Apr 29, 2021

Hi @jorie-peng, thanks for your question. As explained in the README, you should use the command:

python run_with_submitit.py --model deit_base_distilled_patch16_224 --distillation-type hard --teacher-model regnety_160 --teacher-path https://dl.fbaipublicfiles.com/deit/regnety_160-a5fe301d.pth --use_volta32

to train a model with distillation. Issue #70 may also be useful. Best, Hugo

Hi @TouvronHugo, thanks for your answer! Issue #70 explains how to train on ImageNet, which is a little different from my question.
As I understand it, the DeiT paper introduces this training method: first train a teacher model on the dataset, then train the student model. So, for my dataset, should I first train a teacher model on my dataset and then train the student model on it? Why does the GitHub README suggest training DeiT by fine-tuning, without the distillation loss?
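If that reading is right, one possible recipe for a custom dataset is the two-step sketch below. This is an assumption pieced together from the commands already in this thread, not something the maintainers confirmed here; every path is a placeholder, and the teacher architecture is illustrative (the maintainers' command above uses regnety_160 as the teacher).

Step 1, fine-tune a teacher on your dataset, e.g.:

python main.py --model deit_base_patch16_224 --finetune https://dl.fbaipublicfiles.com/deit/deit_base_patch16_224-b5f2ef4d.pth --data-path /path/to/my/dataset --output_dir /path/to/teacher

Step 2, train the distilled student against that teacher checkpoint, e.g.:

python main.py --model deit_base_distilled_patch16_224 --distillation-type hard --teacher-model deit_base_patch16_224 --teacher-path /path/to/teacher/checkpoint.pth --data-path /path/to/my/dataset --output_dir /path/to/student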

0 reactions
lxy5513 commented, May 31, 2021

@TouvronHugo Thanks.


Top Results From Across the Web

DeiT
The DeiT model was proposed in Training data-efficient image transformers & distillation through attention by Hugo Touvron, Matthieu Cord, Matthijs Douze, ...

DeiT Data-Efficient Image Transformer | AIGuys
DeiT introduced a novel distillation technique to make ViT perform well and generalize well, without being pre-trained on huge datasets. DeiT is eco-friendly, ...

Paper Walkthrough: DeiT (Data-efficient image Transformer)
The results hold when DeiT is pre-trained on ImageNet and transferred to other datasets, indicating the model is suitable for transfer learning.

DeiT Explained
A Data-Efficient Image Transformer is a type of Vision Transformer for image classification tasks. The model is trained using a teacher-student strategy ...

DeiT — MMClassification 0.25.0 documentation
More importantly, we introduce a teacher-student strategy specific to transformers. It relies on a distillation token ensuring that the student learns from ...
