The config.yaml settings in experiments/ are quite different from the settings in the paper

See original GitHub issue. @jerrybai1995 @vkoltun @zkolter
Thanks for the nice work! I am quite interested in the MDEQ model and tried to train it on the CIFAR-10 dataset. However, I can't get the expected performance reported in the paper. Specifically, the accuracy of MDEQ-small (ours) without data augmentation in Table 1 is 87.1%, while I only get 80.3% after removing the data augmentation code in tools/cls_train.py.
I checked experiments/cifar/cls_mdeq_TINY.yaml and experiments/cifar/cls_mdeq_LARGE.yaml carefully and found that the settings are quite different from those in the paper (i.e., Table 4 in Appendix A), including the dropout rate, the forward/backward thresholds, the number of groups in GroupNorm, and so on. I adjusted the settings to match Table 4, but the performance did not improve. I am not sure whether LR_STEP or other settings in the .yaml file that cannot be found in the paper harm the training process.
Is there any suggestion for reproducing the performance (87.1% ± 0.4%) reported in Table 1 of the paper? In addition, I would appreciate it if you could share the YAML files used in your paper experiments, since I find that the config.yaml settings for cifar10 and imagenet in experiments/ are quite different from those in Table 4 in Appendix A of the original paper.
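For reference, the knobs discussed above all live in the experiment YAML. A hypothetical fragment illustrating the kinds of fields in question (the key names and values here are assumptions for illustration, not the repo's actual config):

```yaml
# Hypothetical excerpt of an experiments/cifar/cls_mdeq_*.yaml config.
# Key names and values are illustrative only.
MODEL:
  DROPOUT: 0.25        # dropout rate (paper Table 4 vs. repo default may differ)
  NUM_GROUPS: 8        # number of groups for GroupNorm
DEQ:
  F_THRES: 30          # forward (equilibrium solve) iteration threshold
  B_THRES: 40          # backward (implicit gradient) iteration threshold
TRAIN:
  LR_STEP: [50, 100]   # epochs at which the learning rate is decayed
  WEIGHT_DECAY: 0.0    # zero/absent weight decay disables L2 regularization
```

Comparing a fragment like this field-by-field against Table 4 in Appendix A is the kind of check the issue describes.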
Issue Analytics
- State:
- Created 3 years ago
- Comments:6 (3 by maintainers)
Top GitHub Comments
After adding the weight_decay param discussed above and trying the newest code & config (22731498bcd5524976664000367caa179fe6c56e), I can finally get ~84.5% accuracy on CIFAR-10 without augmentation, which is reasonable according to jerrybai1995's response. I will close this issue. Thanks for the quick response and bug fix again 💯.
Ah, good point about the WD! Another thing that I overlooked when I cleaned up the code. Thanks a lot for the pointer.
And yes, do let me know if you still can't improve over 80% on CIFAR-10 (no augmentation) 😄