
How to train a custom seq2seq model with BertModel

See original GitHub issue

I would like to use a Chinese pretrained model based on BertModel,

so I've tried the EncoderDecoderModel, but it seems the EncoderDecoderModel is not meant for conditional text generation.

I also saw that BartModel seems to be the model I need, but I cannot load pretrained BertModel weights into BartModel.

By the way, could I fine-tune a BartModel for seq2seq with custom data?

Any suggestions? Thanks.
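For context on the two sub-questions: the EncoderDecoderModel can in fact be used for conditional generation once its generation-related token IDs are configured, and a Chinese BERT checkpoint can be warm-started into both halves. A minimal sketch, assuming the real bert-base-chinese checkpoint (the sample sentence and generation length are illustrative); a BART fine-tuning sketch follows the web results further down:

```python
# Minimal sketch (not from the thread): warm-start a BERT2BERT
# encoder-decoder from a Chinese checkpoint and run generation.
from transformers import BertTokenizer, EncoderDecoderModel

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-chinese", "bert-base-chinese"
)

# BERT has no native decoder-start/eos tokens, so reuse [CLS]/[SEP];
# generate() needs these IDs set before it can decode.
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.eos_token_id = tokenizer.sep_token_id
model.config.pad_token_id = tokenizer.pad_token_id

inputs = tokenizer("一个中文输入句子。", return_tensors="pt")  # "A Chinese input sentence."
generated = model.generate(inputs.input_ids, max_length=32)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```

Before fine-tuning, the decoder cross-attention weights are freshly initialized, so the untrained model's generations will be noise; the point of the sketch is only that loading and conditional generation work.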

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 30 (14 by maintainers)

Top GitHub Comments

7 reactions
patrickvonplaten commented, May 24, 2020

Hi @chenjunweii - thanks for your issue! I will take a deeper look at the EncoderDecoder framework at the end of this week and should add a Google Colab on how to fine-tune it.

6 reactions
patrickvonplaten commented, Jul 15, 2020

Yeah, the code is ready in this PR: https://github.com/huggingface/transformers/tree/more_general_trainer_metric . The script to train an Encoder-Decoder model can be accessed here: https://github.com/huggingface/transformers/blob/more_general_trainer_metric/src/transformers/bert_encoder_decoder_summary.py

And in order for the script to work, you need to use this Trainer class: https://github.com/huggingface/transformers/blob/more_general_trainer_metric/src/transformers/trainer.py

I’m currently training the model myself. When the results are decent, I will publish a little notebook.
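The linked PR predates the current transformers API, so as a rough sketch only: this is what a single supervised training step for a warm-started BERT2BERT model looks like today, not the PR's script. The checkpoint, texts, and learning rate are illustrative.

```python
# Sketch of one training step for a warm-started BERT2BERT model.
import torch
from transformers import BertTokenizer, EncoderDecoderModel

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-chinese", "bert-base-chinese"
)
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

src = tokenizer("输入文本", return_tensors="pt")   # source text
tgt = tokenizer("目标摘要", return_tensors="pt")   # target summary

# Passing labels makes recent transformers versions build the
# decoder inputs by shifting the labels and return a cross-entropy loss.
outputs = model(
    input_ids=src.input_ids,
    attention_mask=src.attention_mask,
    labels=tgt.input_ids,
)
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
```

In practice this step would run inside a DataLoader loop (or be handed off to a Trainer) over the custom dataset; the single hard-coded pair here is just to keep the sketch self-contained.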

Read more comments on GitHub >

Top Results From Across the Web

  • How To Train a Seq2Seq Summarization Model Using “BERT ...
    In the following example, we use BERT-base as both encoder and decoder. Code 1. The code to load the pre-trained model. Since the...
  • BertGeneration - Hugging Face
    The BertGeneration model is a BERT model that can be leveraged for sequence-to-sequence tasks using EncoderDecoderModel, as proposed in Leveraging ...
  • Seq2Seq Model - Simple Transformers
    The following rules currently apply to generic Encoder-Decoder models (they do not apply to BART and Marian): the decoder must be a BERT model...
  • Headliner — Easy training and deployment of seq2seq models
    The authors of the BertSum paper made two key adjustments to the BERT model: first, customized data preprocessing, and second, a specific ...
  • HuggingFace Finetuning Seq2Seq Transformer Model Coding ...
    In this video, we're going to fine-tune a T5 model using HuggingFace to solve a seq2seq problem.
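On the asker's side question about BART: BART ships with its own pretrained weights, so BERT weights cannot be loaded into it, but fine-tuning it on custom seq2seq pairs is straightforward. A minimal sketch, assuming the facebook/bart-base checkpoint; the data and hyperparameters are illustrative:

```python
# Sketch: fine-tuning BART on custom seq2seq pairs.
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
model.train()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

# Toy dataset of (source, target) pairs; replace with real custom data.
pairs = [("a long source document ...", "a short target summary")]
for source, target in pairs:
    batch = tokenizer(source, return_tensors="pt", truncation=True)
    labels = tokenizer(target, return_tensors="pt", truncation=True).input_ids
    # BART shifts the labels internally to build decoder inputs
    # and returns the cross-entropy loss directly.
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```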
