  1. There’s a bug when using the attention layer. At this line: https://github.com/facebookresearch/ParlAI/blob/55fcf6127309f3c0e2f15c1fe6eae1fd71afcbcb/parlai/agents/seq2seq/modules.py#L80 new hidden states are returned but never used to compute the next prediction. This is why the attention model performs so poorly. Here’s the result after just 30 minutes of training:
TEXT:  I get to read the articles of extradition acordind to the European Court of human rights .
PREDICTION:  i was just a little bit of a lot of people .
~
TEXT:  Yes , you are the very monster I created
PREDICTION:  i will be a good thing
~
TEXT:  Hello , detective Spooner .
PREDICTION:  i don' t know .
~
TEXT:  I' m a tiger .
PREDICTION:  i don' t know .
~
TEXT:  What' ve you got ?
PREDICTION:  i don' t know .
~
TEXT:  We are going to change the way we see the road .
PREDICTION:  i don' t know what you are .

What’s more, the attention model (using local for Twitter and general for OpenSubtitles) can really lower the loss.

  2. The default value of lookuptable https://github.com/facebookresearch/ParlAI/blob/55fcf6127309f3c0e2f15c1fe6eae1fd71afcbcb/parlai/agents/seq2seq/seq2seq.py#L107 causes much higher memory usage, but I couldn’t find the reason. The old value, all, works fine.

  3. At this line of the vectorize() function, https://github.com/facebookresearch/ParlAI/blob/55fcf6127309f3c0e2f15c1fe6eae1fd71afcbcb/parlai/agents/seq2seq/seq2seq.py#L403 only 6 values are returned, but the newer version needs 7.
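
To illustrate the kind of bug described in item 1, here is a toy sketch in plain Python. It is not ParlAI's code: `toy_step`, `decode_buggy`, and `decode_fixed` are hypothetical names, and the arithmetic stands in for a real recurrent decoder step. It only shows how discarding the returned hidden state makes every step start from the same initial state.

```python
def toy_step(token, hidden):
    """Stand-in for one decoder step: returns (output, new_hidden)."""
    new_hidden = hidden + token   # the state accumulates what has been seen
    output = new_hidden * 2       # the output depends on the updated state
    return output, new_hidden


def decode_buggy(tokens, hidden):
    """The new hidden state is returned by the step but never used."""
    outputs = []
    for token in tokens:
        output, _unused_hidden = toy_step(token, hidden)
        outputs.append(output)    # every step still sees the initial state
    return outputs


def decode_fixed(tokens, hidden):
    """The new hidden state is carried into the next step."""
    outputs = []
    for token in tokens:
        output, hidden = toy_step(token, hidden)
        outputs.append(output)
    return outputs


if __name__ == "__main__":
    print(decode_buggy([1, 2, 3], 0))  # [2, 4, 6]
    print(decode_fixed([1, 2, 3], 0))  # [2, 6, 12]
```

In the buggy loop each output depends only on the current token, so the decoder has no memory of earlier steps, which would be consistent with the generic predictions shown above.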

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

alexholdenmiller commented, Feb 26, 2018
  1. wow, thanks. that was some copypasta from the line above it but was really hurting training with attention. thanks for the catch.

  2. unique uses 3x more memory than all, intentionally. all shares the same tensor for the weight of the encoder Embedding layer, the decoder Embedding layer, and the final Linear layer producing an output token. unique keeps them separate, and enc_dec and dec_out share the mentioned pairs.

  3. fixing, thanks.
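
The memory difference described in item 2 can be sketched with a rough parameter count. This is a back-of-the-envelope estimate, not ParlAI's implementation: the mode names come from the comment above, while `lookup_param_count`, `lookup_megabytes`, the vocabulary size, and the embedding size are hypothetical. It assumes the final Linear layer has the same vocabulary-by-embedding shape as the Embedding layers, ignores biases, and assumes 4-byte floats.

```python
# Number of separate vocab-by-embedding weight matrices kept in each mode,
# following the maintainer's description of the lookuptable option.
MATRICES_PER_MODE = {
    "unique": 3,   # encoder Embedding, decoder Embedding, output Linear
    "enc_dec": 2,  # encoder and decoder Embeddings share one tensor
    "dec_out": 2,  # decoder Embedding and output Linear share one tensor
    "all": 1,      # all three share one tensor
}


def lookup_param_count(mode, vocab_size, emb_size):
    """Approximate number of weights in the lookup tables (biases ignored)."""
    return MATRICES_PER_MODE[mode] * vocab_size * emb_size


def lookup_megabytes(mode, vocab_size, emb_size, bytes_per_param=4):
    """Approximate size of those weights in MB, assuming 4-byte floats."""
    return lookup_param_count(mode, vocab_size, emb_size) * bytes_per_param / 1e6


if __name__ == "__main__":
    # Hypothetical sizes, chosen only for illustration.
    for mode in ("unique", "enc_dec", "dec_out", "all"):
        print(mode, lookup_param_count(mode, 50_000, 128))
```

With these assumptions, unique holds three times as many lookup weights as all. Gradients and optimizer state for those weights would scale the same way, so the gap in training memory can be larger than the weights alone suggest.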

ShaojieJiang commented, Feb 28, 2018

Hi @alexholdenmiller, that’s great! I tried that implementation before, but the memory leak bug made it less fascinating to me.
