Command to train Persona-Chat baseline seq2seq model
After looking at the personachat directory, I was wondering what command to use to train the seq2seq model using ParlAI. It looks like there's a different seq2seq model being used there. Some colleagues mentioned they tried to train with the default ParlAI seq2seq using the options from the paper and ran into out-of-memory errors until they reduced the batch size.
The question is: what command would you recommend to replicate the baseline? For example:

python examples/train_model.py -t babi:task10k:1 -m seq2seq -mf /tmp/model_s2s -bs 32 -vtim 30 -vcut 0.95
Apologies if this sounds like a lazy question, or if there's already an answer that I missed, but hopefully this will be of interest to other people as well.
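For concreteness, here is the kind of command I had in mind, adapted from the bAbI example above (a sketch only: the `personachat` task name comes from the eval example elsewhere in this thread, and the model file path and batch size are my guesses, not confirmed settings from the paper):

```shell
# Train the default ParlAI seq2seq model on the Persona-Chat task.
# -bs 32 is a reduced batch size to avoid the out-of-memory errors
# colleagues reported; -vtim/-vcut control validation timing and early stopping.
python examples/train_model.py -t personachat -m seq2seq \
    -mf /tmp/persona_s2s -bs 32 -vtim 30 -vcut 0.95
```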
Issue Analytics
- Created 5 years ago
- Comments: 8 (8 by maintainers)
Thanks so much for the response. That’s super helpful. I’ll try running more models with different options and see what I can figure out.
The reason I mentioned using a separate encoder vs. token-marked encoding is that that was the main difference you noted between the personachat-specific seq2seq and the main ParlAI seq2seq. As you said, though, there are more features in the main ParlAI seq2seq, so it's probably not a fair comparison.
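To make the distinction concrete, here is a minimal sketch of what I mean by "token-marked encoding": the persona sentences are flattened into the dialogue context with a marker token, rather than fed through a separate persona encoder. The function name and the `__p__` marker are hypothetical illustrations, not actual ParlAI identifiers:

```python
def build_input(persona_lines, dialogue_history, persona_tok="__p__"):
    """Flatten persona lines into the context string, each prefixed with a
    marker token, so a single encoder sees persona and history together."""
    persona = " ".join(f"{persona_tok} {line}" for line in persona_lines)
    return f"{persona} {dialogue_history}".strip()

text = build_input(["i like to ski.", "i have two dogs."], "hi , how are you ?")
# text == "__p__ i like to ski. __p__ i have two dogs. hi , how are you ?"
```

A separate-encoder design would instead encode the persona lines independently and combine the two representations inside the model.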
Let me run through those:
__END__ token: the leaderboard is based on the separate eval_ppl script, which does a much more careful job of evaluating and doesn't include these extra special characters from the model, so that the different models can be compared exactly. This ppl tends to be worse (e.g. adding a few points to the valid ppl; of course, predicting __END__ is easy!).
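A small sketch of why including a near-certain special token deflates perplexity (the per-token probabilities below are made up for illustration):

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability per token."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# Hypothetical per-token probabilities for a short response.
content_probs = [0.05, 0.10, 0.08, 0.12]   # ordinary word tokens
with_end = content_probs + [0.99]          # plus a trivially predictable __END__

# Averaging in the near-certain __END__ token lowers the reported perplexity,
# which is why an eval script that excludes such tokens reports a worse number.
assert perplexity(with_end) < perplexity(content_probs)
```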