
MemNN example does not reproduce paper results

See original GitHub issue

The MemNN implementation in the examples folder does not appear to reproduce the results reported in the corresponding paper. In particular, when run on bAbI task 2 it does not converge to a solution. The implementation was merged in #4222.
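For context, the model in question is an end-to-end memory network. A single memory hop can be sketched roughly as follows in plain NumPy; all names and sizes here are illustrative, not taken from the Chainer example:

```python
import numpy as np

# Rough sketch of one memory hop in an end-to-end memory network
# (Sukhbaatar et al., 2015). Dimensions are illustrative.
rng = np.random.default_rng(0)
d, n_mem = 4, 3                      # embedding size, number of memory slots

u = rng.standard_normal(d)           # query embedding (B * question)
m = rng.standard_normal((n_mem, d))  # input memories  (A * sentences)
c = rng.standard_normal((n_mem, d))  # output memories (C * sentences)

scores = m @ u                       # match query against each memory slot
p = np.exp(scores - scores.max())
p /= p.sum()                         # softmax attention over memories
o = p @ c                            # attention-weighted sum of output memories
u_next = u + o                       # becomes the query for the next hop
```

In the full model, the final `u_next` is projected to the answer vocabulary and trained with a softmax cross-entropy loss.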

Platform

  • Chainer version: 5.3.0
  • CuPy version: running on CPU
  • OS/Platform: Linux 4.15.0-38-generic #41-Ubuntu

Code to reproduce

We run the instructions from the readme with task 2 data:

python3 train_memnn.py data/tasks_1-20_v1-2/en/qa2_two-supporting-facts_train.txt data/tasks_1-20_v1-2/en/qa2_two-supporting-facts_test.txt --model qa2model

Error messages, stack traces, or logs

The output of the above command:

Training data: data/tasks_1-20_v1-2/en/qa2_two-supporting-facts_train.txt: 200                                        
Test data: data/tasks_1-20_v1-2/en/qa2_two-supporting-facts_test.txt: 200                                             
epoch       main/loss   validation/main/loss  main/accuracy  validation/main/accuracy                                 
1           3.50365     3.4641                0.15           0.187                                                    
2           3.43317     3.36324               0.21           0.187                                                    
3           3.30038     3.17261               0.21           0.187                                                    
4           3.05496     2.84275               0.21           0.187                                                    
5           2.66537     2.39886               0.21           0.187                                                    
6           2.25085     2.06654               0.21           0.187                                                    
7           1.99869     1.91604               0.21           0.187
8           1.88459     1.85773               0.21           0.187
9           1.83436     1.84326               0.21           0.187
--------
83          1.15372     2.02805               0.583          0.25
84          1.14651     2.03602               0.592          0.254
85          1.14041     2.05358               0.594          0.255
86          1.13725     2.06104               0.596          0.255
87          1.13028     2.06745               0.602          0.255
88          1.1207      2.07252               0.597          0.257
89          1.11511     2.0785                0.598          0.251
90          1.10864     2.08676               0.602          0.257
91          1.1021      2.10154               0.604          0.254
92          1.09391     2.10309               0.603          0.256
93          1.09071     2.11311               0.605          0.26
94          1.08164     2.12224               0.612          0.255
95          1.07694     2.13073               0.617          0.261
96          1.07041     2.14001               0.617          0.26
97          1.06327     2.14919               0.619          0.266
98          1.05924     2.16001               0.629          0.261
99          1.05256     2.16606               0.629          0.27
100         1.04679     2.17209               0.638          0.261

After 100 epochs the model appears to have simply overfitted the task, even though the paper's reference implementation converges to a solution without linear start.
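For readers unfamiliar with the term, "linear start" is the training trick from the MemN2N paper in which the softmax over memory scores is disabled early in training and re-enabled once validation loss stops improving. A minimal sketch of that schedule, with made-up loss values:

```python
import numpy as np

def attention(scores, linear_start):
    # During linear start, use the raw inner-product scores directly.
    if linear_start:
        return scores
    e = np.exp(scores - scores.max())
    return e / e.sum()               # standard softmax otherwise

# Illustrative validation losses; in practice these come from training.
val_losses = [3.46, 3.36, 3.17, 2.84, 2.40, 2.41, 2.07]

linear_start, best, switch_epoch = True, float("inf"), None
for epoch, loss in enumerate(val_losses):
    if linear_start and loss >= best:
        linear_start = False         # loss plateaued: switch softmax back on
        switch_epoch = epoch
    best = min(best, loss)
```

The point of the issue, however, is that the paper reports convergence even without this trick, so its absence alone should not explain the gap.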

We ran the command 10 times and observed the same behaviour each time. Are there tests or checked results for the example implementation of the memory network on the bAbI dataset?
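A checked-results test of the kind asked about could, as a rough sketch, parse the trainer's log table and assert a minimum final validation accuracy. The parsing and the threshold below are assumptions for illustration, not part of the Chainer example:

```python
def final_val_accuracy(log_text):
    # The last column of Chainer's LogReport table is
    # validation/main/accuracy; data rows start with the epoch number.
    rows = [line.split() for line in log_text.strip().splitlines()
            if line and line.split()[0].isdigit()]
    return float(rows[-1][-1])

log = """\
epoch  main/loss  validation/main/loss  main/accuracy  validation/main/accuracy
99     1.05256    2.16606               0.629          0.27
100    1.04679    2.17209               0.638          0.261
"""
acc = final_val_accuracy(log)
# A regression test might require e.g. acc >= 0.8; the run reported
# in this issue would fail such a check.
```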

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Reactions: 1
  • Comments: 11 (5 by maintainers)

Top GitHub Comments

takagi commented, May 8, 2019 (1 reaction)

I’ve reproduced your result, will look into it further.

stale[bot] commented, Apr 9, 2020 (0 reactions)

This issue is closed as announced. Feel free to re-open it if needed.
