Unable to reproduce val results
Hi @rohitgirdhar, I’m trying to test the irCSN-152 (IG65M) model for EK-55. I used the model https://dl.fbaipublicfiles.com/avt/checkpoints/expts/10_ek55_avt_ig65m.txt/0/checkpoint.pth and the config expts/10_ek55_avt_ig65m.txt, and added these lines to the config:
test_only=true
train.init_from_model=[[${cwd}/DATA/models/10_ek55_avt_ig65m.pth]]
However, I’m getting
[2021-10-05 12:37:04,999][root][INFO] - Reading from resfiles
[2021-10-05 12:37:11,072][func.train][INFO] - []
[2021-10-05 12:37:11,073][root][INFO] - iter_time: 0.294328
[2021-10-05 12:37:11,073][root][INFO] - data_time: 0.135377
[2021-10-05 12:37:11,074][root][INFO] - loss: 6.164686
[2021-10-05 12:37:11,074][root][INFO] - acc1/action: 7.351763
[2021-10-05 12:37:11,074][root][INFO] - acc5/action: 19.931891
[2021-10-05 12:37:11,074][root][INFO] - cls_action: 6.134162
[2021-10-05 12:37:11,074][root][INFO] - feat: 0.030524
which is far from the expected 14.4 (top-1) and 31.7 (top-5) performance. Do you know what might be wrong here?
Issue Analytics
- State:
- Created 2 years ago
- Comments:7 (4 by maintainers)
Top GitHub Comments
Great! The configs should run with a 16GB GPU. In my initial experiments I found that more heads/layers for EK55 did help in getting better performance. You can try with fewer, though the performance might be a bit lower. Closing this task, but feel free to open another one if you face any other issues.
Hmm, that is strange. It seems the problem might then be with the IG65M features. Can you try re-downloading the LMDB file? I have already tried it with a fresh download of the LMDB file and it seems to work. Could you also try the corresponding experiment with the Epic Kitchens-100 IG65M LMDB file?
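One way to check whether a re-download actually differs from the copy already on disk is to compare file hashes. The following is a minimal sketch using only the Python standard library; it assumes each copy is a single regular file (for an LMDB stored as a directory, point it at the `data.mdb` inside each one). The function names are illustrative and are not part of the AVT codebase.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large feature files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """Return True when both files hash to the same SHA-256 digest."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

Calling `same_contents(old_copy, fresh_copy)` with the two local paths returns `False` if the earlier download was truncated or corrupted relative to the fresh one. If it returns `True`, the features file itself is probably not the cause.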
Btw, for the AR numbers: I don’t actually print them in the logs at the end, but they should be in the tensorboard files. So you can just run tensorboard on the output directory and see the AR5 numbers.
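If opening the tensorboard UI is inconvenient, the scalars can also be read from the event files programmatically. This is a rough sketch using the `EventAccumulator` class that ships with the `tensorboard` package. The `"AR5"` substring is only a guess at how the relevant tags are named, and the `logdir` argument is whatever output directory your run wrote to; neither is confirmed by the thread. As far as I know, `EventAccumulator` only reads event files directly inside the given directory, so point it at the specific run folder.

```python
def filter_tags(tags, substring):
    """Return the tags that contain `substring`, ignoring case."""
    needle = substring.lower()
    return [t for t in tags if needle in t.lower()]


def read_scalars(logdir, substring="AR5"):
    """Collect (step, value) pairs for every scalar tag matching `substring`."""
    # Imported lazily so filter_tags stays usable without tensorboard installed.
    from tensorboard.backend.event_processing.event_accumulator import (
        EventAccumulator,
    )

    acc = EventAccumulator(logdir)
    acc.Reload()  # parse the event files found in logdir
    scalar_tags = acc.Tags()["scalars"]
    return {
        tag: [(ev.step, ev.value) for ev in acc.Scalars(tag)]
        for tag in filter_tags(scalar_tags, substring)
    }
```

If `read_scalars` returns an empty dict, calling `filter_tags(scalar_tags, "")` inside it (or passing `substring=""`) lists every scalar tag, which shows what the AR metrics are actually called.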