
Extending Encoder Decoder to GPT-2

See original GitHub issue

Adding GPT2 initialization for the EncoderDecoder model, as pointed out in the issue comment quoted below.

Currently, only Bert works as a decoder. We might add GPT2 in a couple of weeks. Note that no model has cross-attention layers unless it is already an encoder-decoder model (like Bart or T5), in which case it does not make sense to use the encoder-decoder wrapper anyway. When a stand-alone model such as Bert is used in the wrapper, its cross-attention layers are initialized with random weights and will have to be fine-tuned. I agree that this should be made clearer in the documentation!

_Originally posted by @patrickvonplaten in https://github.com/huggingface/transformers/issues/4517#issuecomment-638058577_
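
For context, a minimal sketch of what the wrapper supported at the time of this issue: warm-starting a BERT2BERT model from two BERT checkpoints. This assumes a transformers release that ships EncoderDecoderModel and uses the public bert-base-uncased checkpoint; as the comment above notes, the cross-attention weights start out random and need seq2seq fine-tuning before generation is useful.

    # Minimal sketch, assuming a transformers release with EncoderDecoderModel
    # and the public bert-base-uncased checkpoint.
    from transformers import BertTokenizer, EncoderDecoderModel

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

    # Warm-start a BERT2BERT model. The BERT checkpoints contain no
    # cross-attention layers, so those are randomly initialized here and must
    # be fine-tuned on a seq2seq task.
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(
        "bert-base-uncased", "bert-base-uncased"
    )
    model.config.decoder_start_token_id = tokenizer.cls_token_id
    model.config.pad_token_id = tokenizer.pad_token_id

    # With untrained cross-attention the output is essentially noise; the call
    # only demonstrates the wiring.
    inputs = tokenizer("GPT2 is not yet supported as the decoder.", return_tensors="pt")
    generated = model.generate(inputs.input_ids, max_length=16)
    print(tokenizer.decode(generated[0], skip_special_tokens=True))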

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 14 (5 by maintainers)

Top GitHub Comments

4 reactions
djw1809 commented, Jul 8, 2020

Got sidetracked with other research; coming back to it in several days. It's working on my end, it just needs to play nice with the rest of the repo.

On Tue, Jul 7, 2020 at 3:32 PM Mihai Ilie notifications@github.com wrote:

@patrickvonplaten (https://github.com/patrickvonplaten) Hello Patrick, I am watching the EncoderDecoder model from transformers with much interest 😃. Any updates on supporting GPT2 with EncoderDecoder?


– Dylan Weber, Research Assistant | PhD Candidate, School of Math and Statistical Sciences, WXLR642/BYENG593, Arizona State University

4 reactions
patrickvonplaten commented, Jun 12, 2020

It’s on the roadmap 😃


Top Results From Across the Web

  • Leveraging Pre-trained Language Model Checkpoints for ...
    In essence, an encoder-decoder model is the combination of a stand-alone encoder, such as BERT, and a stand-alone decoder model, such as GPT2... (a hedged sketch of this pairing follows after this list)
  • Generating captions with ViT and GPT2 using 🤗 Transformers
    Using Encoder Decoder models in HF to combine vision and text.
  • Why does GPT-2 Exclude the Transformer Encoder?
    It works just like a traditional language model as it takes word vectors as input and produces estimates for the probability of the...
  • Understanding the GPT-2 Source Code Part 2 - Medium
    An explanation of Byte Pair Encoding tokenization: bpe_tokens.extend(self.encoder[bpe_token] for bpe_token in self.bpe(token).split(' ')).
  • arXiv:2010.07576v1 [cs.CL] 15 Oct 2020
    architecture with both encoder and decoder duplicated from a pretrained language ... to GPT2-sw, despite it extends the latter one using...
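
The first result above describes the exact combination this issue asks for: a stand-alone encoder such as BERT paired with a stand-alone decoder such as GPT2. Below is a hedged sketch of that pairing, assuming a later transformers release in which GPT2 gained cross-attention support and can be used as the decoder inside EncoderDecoderModel; the bert-base-uncased and gpt2 checkpoint names are simply the public defaults, and the new cross-attention weights again require seq2seq fine-tuning.

    # Hedged sketch, assuming a transformers release where GPT2 can act as the
    # decoder inside EncoderDecoderModel (support landed after this issue).
    from transformers import AutoTokenizer, EncoderDecoderModel

    enc_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    dec_tok = AutoTokenizer.from_pretrained("gpt2")

    # The GPT2 checkpoint also lacks cross-attention layers, so they are added
    # with random weights and have to be fine-tuned before the model is useful.
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(
        "bert-base-uncased", "gpt2"
    )
    model.config.decoder_start_token_id = dec_tok.bos_token_id
    model.config.pad_token_id = enc_tok.pad_token_id

    inputs = enc_tok("A BERT encoder feeding a GPT2 decoder.", return_tensors="pt")
    summary_ids = model.generate(inputs.input_ids, max_length=16)
    print(dec_tok.decode(summary_ids[0], skip_special_tokens=True))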
