Encoder/Decoder generation
Hello! I tried to train a Bert2Bert model for QA generation, but when I call the generate function it returns gibberish. I also tried the example code below, which likewise generates gibberish (the output is "[PAD] leon leon leon leon leonieieieieie shall shall shall shall shall shall shall shall shall"). Is the generate function supposed to work for EncoderDecoder models, and what am I doing wrong?
from transformers import EncoderDecoderModel, BertTokenizer
import torch

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = EncoderDecoderModel.from_encoder_decoder_pretrained('bert-base-uncased', 'bert-base-uncased')  # initialize Bert2Bert

# encode an example source sentence
input_ids = tokenizer.encode("Hello, my dog is cute", return_tensors='pt')
generated = model.generate(input_ids, decoder_start_token_id=model.config.decoder.pad_token_id)
print(tokenizer.decode(generated[0]))
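For context, the Hugging Face examples for warm-started encoder-decoders set a few special-token ids on the config before generating, and a freshly warm-started Bert2Bert has randomly initialized cross-attention weights, so some gibberish before fine-tuning is expected. A minimal sketch following those examples (reusing BERT's [CLS]/[SEP]/[PAD] tokens; fine-tuning is still required for sensible output):

# following the Hugging Face warm-starting examples: reuse BERT's
# special tokens as the decoder's start/end/pad tokens
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.eos_token_id = tokenizer.sep_token_id
model.config.pad_token_id = tokenizer.pad_token_id
model.config.vocab_size = model.config.encoder.vocab_size

# even with these set, the cross-attention weights are randomly
# initialized, so the model must be fine-tuned before generate
# produces meaningful text
generated = model.generate(input_ids)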
Top GitHub Comments
If you saved your model using .save_pretrained, then you can load it using just .from_pretrained, as you would load any other HF model. Just pass the path of your saved model. You won't need to use .from_encoder_decoder_pretrained.
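A minimal sketch, with './bert2bert' standing in for wherever you saved the model:

from transformers import EncoderDecoderModel

# the combined checkpoint loads like any other HF model
model = EncoderDecoderModel.from_pretrained('./bert2bert')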
Why do you save the encoder and decoder models separately? A single line should be enough.
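Presumably that line is one save_pretrained call on the combined model; a sketch, again with './bert2bert' as a placeholder path:

# saves encoder, decoder, and the shared config to one folder
model.save_pretrained('./bert2bert')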
We moved away from saving the model to two separate folders; see https://github.com/huggingface/transformers/pull/3383. The docs at https://huggingface.co/transformers/model_doc/encoderdecoder.html might also be useful.