Unable to reproduce the 100k results
Dear Authors, thanks for open-sourcing the code. I pretrained for 100k steps and fine-tuned on VQAv2, but my test-dev score is about 65, well below the 70.8 reported in the paper.
Here are my pretraining and fine-tuning commands:
python run.py with data_root=vilt_dataset/ \
num_gpus=8 num_nodes=8 task_mlm_itm whole_word_masking=True step100k \
per_gpu_batchsize=64 exp_name=pretrain
python run.py with data_root=vilt_dataset/ \
num_gpus=8 num_nodes=1 task_finetune_vqa_randaug \
per_gpu_batchsize=32 load_path="result/pretrain_seed0_from_/version_0/checkpoints/last.ckpt" \
exp_name=vqa_finetune
I generate the prediction JSON with:
python run.py with data_root=vilt_dataset/ \
num_gpus=4 num_nodes=1 task_finetune_vqa \
per_gpu_batchsize=256 load_path="result/vqa_finetune_seed0_from_last/version_0/checkpoints/last.ckpt" \
test_only=True exp_name="test_vqa"
Here are my pretraining and fine-tuning TensorBoard logs.
Top GitHub Comments
@JACKHAHA363 Thank you for your report. After carefully comparing the published (cleaned) version and our internal version of the source code, we found that the internal version trains the pretraining losses jointly, whereas the cleaned version alternates between them.
I patched the code to do the joint training (https://github.com/dandelin/ViLT/commit/98a51e6058b1bcdd98ee6628ceacdd1c7325525f); please try this version. Sorry for our mistake. Alternating training needs more iterations to converge.
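As a toy illustration of why alternating between objectives can take more iterations than optimizing them jointly, here is a minimal sketch. It is not the ViLT code: the two quadratic "losses", the two independent parameters, and the names `grad_mlm`, `grad_itm`, and `train` are all assumptions made up for this example. In a real model the objectives share parameters, so the effect will not be this clean.

```python
def grad_mlm(params):
    # Toy stand-in for the MLM loss (a - 1)^2; gradient w.r.t. (a, b).
    a, _ = params
    return (2.0 * (a - 1.0), 0.0)


def grad_itm(params):
    # Toy stand-in for the ITM loss (b + 2)^2; gradient w.r.t. (a, b).
    _, b = params
    return (0.0, 2.0 * (b + 2.0))


def total_loss(params):
    a, b = params
    return (a - 1.0) ** 2 + (b + 2.0) ** 2


def train(steps, joint, lr=0.1):
    """Run plain gradient descent and return the final combined loss.

    joint=True  : every step uses the sum of both gradients.
    joint=False : steps alternate between the two objectives.
    """
    params = (0.0, 0.0)
    for step in range(steps):
        if joint:
            grads = [grad_mlm(params), grad_itm(params)]
        elif step % 2 == 0:
            grads = [grad_mlm(params)]
        else:
            grads = [grad_itm(params)]
        ga = sum(g[0] for g in grads)
        gb = sum(g[1] for g in grads)
        params = (params[0] - lr * ga, params[1] - lr * gb)
    return total_loss(params)


if __name__ == "__main__":
    print("joint, 20 steps:      ", train(20, joint=True))
    print("alternating, 20 steps:", train(20, joint=False))
    print("alternating, 40 steps:", train(40, joint=False))
```

In this toy setup each objective receives only half as many updates under the alternating schedule, so it takes about twice as many steps to reach the same combined loss as the joint schedule.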
@JACKHAHA363 Those two objectives (MLM and ITM) need different inputs. For ITM, we use unmasked inputs (and also misaligned image-text pairs). So an iteration requires running the transformer three times: aligned masked text + image for MLM, and aligned unmasked text + image plus misaligned unmasked text + image for ITM.
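The three inputs per iteration described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the repository's data pipeline: `build_pass_inputs`, the `[MASK]` string, the 0.15 masking probability, and the per-token masking (rather than whole-word masking) are all hypothetical choices for this example.

```python
import random

MASK = "[MASK]"  # placeholder mask token, assumed for this sketch


def build_pass_inputs(batch, mask_prob=0.15, seed=0):
    """Build inputs for the three transformer passes of one iteration.

    batch: list of (tokens, image) aligned pairs, with at least two items.
    Returns (mlm, itm_pos, itm_neg):
      mlm     : (masked tokens, aligned image)
      itm_pos : (unmasked tokens, aligned image, label 1)
      itm_neg : (unmasked tokens, image from another sample, label 0)
    """
    if len(batch) < 2:
        raise ValueError("need at least two samples to form misaligned pairs")
    rng = random.Random(seed)
    n = len(batch)
    mlm, itm_pos, itm_neg = [], [], []
    for i, (tokens, image) in enumerate(batch):
        # Pass 1: aligned pair with some text tokens masked, for MLM.
        masked = [MASK if rng.random() < mask_prob else t for t in tokens]
        mlm.append((masked, image))
        # Pass 2: aligned pair with unmasked text, ITM positive.
        itm_pos.append((list(tokens), image, 1))
        # Pass 3: same unmasked text with another sample's image, ITM negative.
        j = (i + rng.randrange(1, n)) % n  # offset in [1, n-1], so j != i
        itm_neg.append((list(tokens), batch[j][1], 0))
    return mlm, itm_pos, itm_neg


if __name__ == "__main__":
    demo = [
        (["a", "dog", "runs"], "img0"),
        (["two", "cats"], "img1"),
        (["a", "red", "car"], "img2"),
    ]
    for name, rows in zip(("mlm", "itm_pos", "itm_neg"), build_pass_inputs(demo)):
        print(name, rows)
```

Each of the three lists would then be fed through the transformer separately, which is why one joint iteration costs three forward passes.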