Implementing Seq2Seq Models
A number of us have shown interest in implementing seq2seq models in Keras (#5358), so I'm creating a central issue for it. We can discuss design decisions and prevent duplication of work here.
There are a number of existing pull requests related to seq2seq models:
- #5559 (Merged) Specify States of RNN Layers Symbolically.
  - We have to specify the initial state of the decoder symbolically. This pull request adds an `initial_state` argument, which is either a tensor or a list of tensors, to `Recurrent.call`.
- #5731 Add `return_state` flag to RNNs for seq2seq modeling.
  - We have to retrieve the final state of the encoder. This pull request adds a `return_state` flag to `Recurrent.__init__`. If `True`, the output of the RNN is a list of tensors: the first is the output sequence, and the remaining tensors are the final state. (A sketch combining this with `initial_state` follows this list.)
- #5737 Implement RNN decoding; feed output t-1 as input t.
  - At test time, we have to feed the output of the decoder back into itself as the input for the next time step. This pull request adds an `output_length` argument to `Recurrent.__init__` (it requires `unroll=True`). If `output_length` is greater than the length of the input, the output of the layer is fed back into the RNN as the input for the next time step.
  - Note: at training time, we can use teacher forcing instead of feeding back the outputs of the decoder. (A sketch of this inference-time loop appears further below.)
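To make the intended API concrete, here is a minimal sketch of how the first two PRs compose into an encoder-decoder trained with teacher forcing. The `return_state` and `initial_state` arguments are the ones proposed in #5731 and #5559; `num_tokens` and `latent_dim` are placeholder hyperparameters, not anything from the PRs themselves.

```python
from keras.models import Model
from keras.layers import Input, LSTM, Dense

num_tokens = 100   # hypothetical vocabulary size
latent_dim = 256   # hypothetical hidden size

# Encoder: return_state=True (PR #5731) exposes the final hidden/cell states.
encoder_inputs = Input(shape=(None, num_tokens))
encoder = LSTM(latent_dim, return_state=True)
_, state_h, state_c = encoder(encoder_inputs)

# Decoder: initial_state (PR #5559) seeds the RNN with the encoder states.
decoder_inputs = Input(shape=(None, num_tokens))
decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs,
                                     initial_state=[state_h, state_c])
decoder_dense = Dense(num_tokens, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)

# Teacher forcing: decoder_inputs is the target sequence shifted by one
# time step, rather than the decoder's own predictions.
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
```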
There is currently a seq2seq example in `keras/examples`: `addition_rnn.py`. However, the implementation could be greatly improved: it supports only a single-layer encoder-decoder architecture, and it does not perform teacher forcing.
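Continuing the sketch above, inference without teacher forcing can be done today by rebuilding the decoder to run one step at a time and feeding its own argmax prediction at step t-1 back in as the input at step t, which is the behavior #5737 aims to fold into the layer itself. The `start_index`, `stop_index`, and `max_len` arguments below are hypothetical token indices and limits:

```python
import numpy as np

# Encoder model: maps an input sequence to the decoder's initial states.
encoder_model = Model(encoder_inputs, [state_h, state_c])

# Decoder model: runs a single step given the previous token and states.
dec_state_h = Input(shape=(latent_dim,))
dec_state_c = Input(shape=(latent_dim,))
dec_out, h, c = decoder_lstm(decoder_inputs,
                             initial_state=[dec_state_h, dec_state_c])
dec_out = decoder_dense(dec_out)
decoder_model = Model([decoder_inputs, dec_state_h, dec_state_c],
                      [dec_out, h, c])

def decode_sequence(input_seq, start_index, stop_index, max_len=50):
    """Greedy decoding: the prediction at step t-1 becomes the input at t."""
    states = encoder_model.predict(input_seq)
    target = np.zeros((1, 1, num_tokens))
    target[0, 0, start_index] = 1.0  # one-hot start-of-sequence token
    decoded = []
    for _ in range(max_len):
        out, h, c = decoder_model.predict([target] + states)
        token = int(np.argmax(out[0, -1, :]))
        if token == stop_index:
            break
        decoded.append(token)
        target = np.zeros((1, 1, num_tokens))
        target[0, 0, token] = 1.0    # feed the prediction back in
        states = [h, c]
    return decoded
```

Note that this loop requires one `predict` call per output token from Python, which is part of the motivation for folding the feedback into the RNN layer itself as #5737 proposes.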
Feel free to share any ideas, questions, or comments you might have.
I will check out the two PRs after we are done releasing Keras 2. Looking forward to reading everyone’s feedback on how to best handle seq2seq models, API-wise. The first PR was a success (symbolic states).
Does official Keras contain a seq2seq implementation now?