MemNN example does not reproduce paper results
The MemNN implementation in the examples folder does not appear to reproduce the results reported in the corresponding paper. In particular, when run on task 2 it does not converge to a solution. The implementation was pulled in by #4222.
Platform
- Chainer version: 5.3.0
- CuPy version: running on CPU
- OS/Platform: Linux 4.15.0-38-generic #41-Ubuntu
Code to reproduce
We run the instructions from the readme with task 2 data:
python3 train_memnn.py data/tasks_1-20_v1-2/en/qa2_two-supporting-facts_train.txt data/tasks_1-20_v1-2/en/qa2_two-supporting-facts_test.txt --model qa2model
Error messages, stack traces, or logs
The output of the above command:
Training data: data/tasks_1-20_v1-2/en/qa2_two-supporting-facts_train.txt: 200
Test data: data/tasks_1-20_v1-2/en/qa2_two-supporting-facts_test.txt: 200
epoch main/loss validation/main/loss main/accuracy validation/main/accuracy
1 3.50365 3.4641 0.15 0.187
2 3.43317 3.36324 0.21 0.187
3 3.30038 3.17261 0.21 0.187
4 3.05496 2.84275 0.21 0.187
5 2.66537 2.39886 0.21 0.187
6 2.25085 2.06654 0.21 0.187
7 1.99869 1.91604 0.21 0.187
8 1.88459 1.85773 0.21 0.187
9 1.83436 1.84326 0.21 0.187
... (epochs 10-82 omitted) ...
83 1.15372 2.02805 0.583 0.25
84 1.14651 2.03602 0.592 0.254
85 1.14041 2.05358 0.594 0.255
86 1.13725 2.06104 0.596 0.255
87 1.13028 2.06745 0.602 0.255
88 1.1207 2.07252 0.597 0.257
89 1.11511 2.0785 0.598 0.251
90 1.10864 2.08676 0.602 0.257
91 1.1021 2.10154 0.604 0.254
92 1.09391 2.10309 0.603 0.256
93 1.09071 2.11311 0.605 0.26
94 1.08164 2.12224 0.612 0.255
95 1.07694 2.13073 0.617 0.261
96 1.07041 2.14001 0.617 0.26
97 1.06327 2.14919 0.619 0.266
98 1.05924 2.16001 0.629 0.261
99 1.05256 2.16606 0.629 0.27
100 1.04679 2.17209 0.638 0.261
At 100 epochs the model appears to have simply overfitted the task (training accuracy keeps climbing while validation accuracy stalls around 0.26), whereas the paper's reference implementation converges to a solution even without linear start.
We ran the command 10 times and observed the same behaviour each time. Are there tests or checked results for the example implementation of the memory network on the bAbI dataset?
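For context, the "linear start" mentioned above is the trick from the End-To-End Memory Networks paper: the softmax over memory attention scores is removed during early training so gradients flow linearly, and is re-inserted once validation loss plateaus. A minimal, framework-agnostic sketch of the idea (plain NumPy, with a hypothetical `memory_attention` helper; this is not the example's actual code):

```python
import numpy as np

def memory_attention(scores, linear_start):
    """Turn raw memory scores into attention weights.

    With linear_start=True the softmax is removed (identity pass-through),
    as in the paper's linear-start phase; with linear_start=False the
    usual softmax is applied.
    """
    if linear_start:
        return scores                      # identity: no softmax yet
    e = np.exp(scores - scores.max())      # numerically stable softmax
    return e / e.sum()

raw = np.array([1.0, 2.0, 3.0])
# Linear-start phase: scores pass through unchanged.
assert np.allclose(memory_attention(raw, linear_start=True), raw)
# After the switch: weights form a proper probability distribution.
assert abs(memory_attention(raw, linear_start=False).sum() - 1.0) < 1e-9
```

The example script does not implement this phase, but the paper reports convergence on several tasks even without it, which is why the non-convergence here looks like a bug rather than a missing trick.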
Issue Analytics
- State: closed
- Created: 5 years ago
- Reactions: 1
- Comments: 11 (5 by maintainers)
Top GitHub Comments
I’ve reproduced your result, will look into it further.
This issue is closed as announced. Feel free to re-open it if needed.