seq2seq not able to replicate results
Hi,
I am trying to get a simple seq2seq model running with decent results on OpenSubtitles. I ran the command below on an Nvidia GPU with 12GB of memory for 15 hours, but the results are not what I expected; I was hoping for results like those in the Neural Conversational Model paper (arXiv:1506.05869).
python3.6 examples/train_model.py -e 13 -m seq2seq -mf godz7 -t opensubtitles -dt train:stream -hist 1 -bs 32 -tr 100 --dict-maxexs=10000000 --gpu 2 --batch-sort false -hs 500 -esz 500 -nl 2 -emb glove -att general -dr 0.3 -lr 0.001 -clip 5
I have tried different values for the hidden size ([2048, 1024, 512]) and, similarly, for the embedding size, trading off the batch size so that GPU memory is not exceeded. I have also tried the default seq2seq options, but the results are not good. Any tips on where I may be going wrong?
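For reference, a minimal sweep over the hidden sizes above could look like the sketch below (the -mf suffix is just a placeholder, and -bs would likely need to be lowered at the larger sizes to stay within 12GB):

# hypothetical sweep over hidden size; all other flags copied from the command above
for hs in 512 1024 2048; do
  python3.6 examples/train_model.py -e 13 -m seq2seq -mf godz7_hs${hs} -t opensubtitles -dt train:stream -hist 1 -bs 32 -tr 100 --dict-maxexs=10000000 --gpu 2 --batch-sort false -hs ${hs} -esz 500 -nl 2 -emb glove -att general -dr 0.3 -lr 0.001 -clip 5
done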
Sample results:
Enter Your Message: hi there buddy
prediction: hypothalamus
[Seq2Seq]: hypothalamus
Enter Your Message: ok maybe something better?
[Seq2Seq]: dogged
Enter Your Message: why are these 1 word
prediction: nineteen
[Seq2Seq]: nineteen
Enter Your Message: and why is it not multi
prediction: llttie
[Seq2Seq]: llttie
Enter Your Message: ok anyways
prediction: bunting
[Seq2Seq]: bunting
Enter Your Message: i missed it
prediction: 7OO
[Seq2Seq]: 7OO
Enter Your Message: is this going to work
prediction: interviewee
[Seq2Seq]: interviewee
Enter Your Message: i guess its just not
[Seq2Seq]: interviewee
Enter Your Message: huh is that repeating
prediction: Simpson
[Seq2Seq]: Simpson
@Jackberg I got somewhat decent results using the language_model in ParlAI, not seq2seq. The hyperparameters I used for that are
-vtim 360 -esz 200 -hs 500 -nl 2 -lr 10 -bs 20
(and this is on the Twitter dataset; I am currently training some models on the new OpenSubtitles)

@emilydinan @alexholdenmiller Great! Looking forward to your recipe for the seq2seq model!
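As a rough sketch, those language_model flags could be assembled into a full training command along the following lines (the -t twitter task and -mf lm_twitter model file are assumptions, reusing the examples/train_model.py entry point from the command above):

# assumed task (-t twitter) and model file (-mf lm_twitter); hyperparameters as quoted above
python3.6 examples/train_model.py -m language_model -t twitter -mf lm_twitter -vtim 360 -esz 200 -hs 500 -nl 2 -lr 10 -bs 20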