MemNN example does not reproduce paper results
The MemNN implementation in the examples folder does not appear to reproduce the results reported in the corresponding paper. In particular, when run on task 2 it does not converge to a solution. The implementation was pulled in by #4222.
Platform
- Chainer version: 5.3.0
- CuPy version: running on CPU
- OS/Platform: Linux 4.15.0-38-generic #41-Ubuntu
Code to reproduce
We run the instructions from the readme with task 2 data:
python3 train_memnn.py data/tasks_1-20_v1-2/en/qa2_two-supporting-facts_train.txt data/tasks_1-20_v1-2/en/qa2_two-supporting-facts_test.txt --model qa2model
Error messages, stack traces, or logs
The output of the above command:
Training data: data/tasks_1-20_v1-2/en/qa2_two-supporting-facts_train.txt: 200
Test data: data/tasks_1-20_v1-2/en/qa2_two-supporting-facts_test.txt: 200
epoch main/loss validation/main/loss main/accuracy validation/main/accuracy
1 3.50365 3.4641 0.15 0.187
2 3.43317 3.36324 0.21 0.187
3 3.30038 3.17261 0.21 0.187
4 3.05496 2.84275 0.21 0.187
5 2.66537 2.39886 0.21 0.187
6 2.25085 2.06654 0.21 0.187
7 1.99869 1.91604 0.21 0.187
8 1.88459 1.85773 0.21 0.187
9 1.83436 1.84326 0.21 0.187
... (epochs 10-82 omitted) ...
83 1.15372 2.02805 0.583 0.25
84 1.14651 2.03602 0.592 0.254
85 1.14041 2.05358 0.594 0.255
86 1.13725 2.06104 0.596 0.255
87 1.13028 2.06745 0.602 0.255
88 1.1207 2.07252 0.597 0.257
89 1.11511 2.0785 0.598 0.251
90 1.10864 2.08676 0.602 0.257
91 1.1021 2.10154 0.604 0.254
92 1.09391 2.10309 0.603 0.256
93 1.09071 2.11311 0.605 0.26
94 1.08164 2.12224 0.612 0.255
95 1.07694 2.13073 0.617 0.261
96 1.07041 2.14001 0.617 0.26
97 1.06327 2.14919 0.619 0.266
98 1.05924 2.16001 0.629 0.261
99 1.05256 2.16606 0.629 0.27
100 1.04679 2.17209 0.638 0.261
At 100 epochs the model appears to have simply overfitted the task (training accuracy keeps climbing while validation accuracy stalls around 0.26), whereas the paper's reference implementation converges to a solution even without linear start.
We ran the command 10 times and observed the same behaviour each time. Are there tests or checked results for the example implementation of the memory network on the bAbI dataset?
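For context, the "linear start" mentioned above is the trick from the End-To-End Memory Networks paper: the softmax over memory attention scores is removed during early training so gradients flow linearly, and is re-inserted once validation loss plateaus. A minimal, framework-agnostic sketch of the idea (plain NumPy, with a hypothetical `memory_attention` helper; this is not the example's actual code):

```python
import numpy as np

def memory_attention(scores, linear_start):
    """Turn raw memory scores into attention weights.

    With linear_start=True the softmax is removed (identity pass-through),
    as in the paper's linear-start phase; with linear_start=False the
    usual softmax is applied.
    """
    if linear_start:
        return scores                      # identity: no softmax yet
    e = np.exp(scores - scores.max())      # numerically stable softmax
    return e / e.sum()

raw = np.array([1.0, 2.0, 3.0])
# Linear-start phase: scores pass through unchanged.
assert np.allclose(memory_attention(raw, linear_start=True), raw)
# After the switch: weights form a proper probability distribution.
assert abs(memory_attention(raw, linear_start=False).sum() - 1.0) < 1e-9
```

The example script does not implement this phase, but the paper reports convergence on several tasks even without it, which is why the non-convergence here looks like a bug rather than a missing trick.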
Issue Analytics
- State: closed
- Created: 5 years ago
- Reactions: 1
- Comments: 11 (5 by maintainers)
Top GitHub Comments
I’ve reproduced your result, will look into it further.
This issue is closed as announced. Feel free to re-open it if needed.