Implementing parameter sharing (Universal Transformers)
What is your question?
I am trying to set up a Universal Transformer in the codebase and wanted some help and insight on it. Basically, Universal Transformers are just standard Transformer models with parameters shared across layers.
Code
I changed the following line so that the same initialized layer instance is reused within the loop in self.layers.extend, for both the encoder and the decoder. Is that the only change required to set it up? I am not able to get good results with it.
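Reusing one layer instance is indeed the core of cross-layer parameter sharing: registering a single module and applying it at every depth. A minimal PyTorch sketch, using torch.nn.TransformerEncoderLayer as a stand-in for the codebase's own encoder layer (class names and dimensions here are illustrative, not the actual codebase's):

```python
import torch
import torch.nn as nn

class SharedLayerEncoder(nn.Module):
    """Universal Transformer-style encoder: one set of layer
    parameters applied repeatedly over depth."""

    def __init__(self, d_model=64, nhead=4, num_steps=6):
        super().__init__()
        # A single layer instance; reusing it in the loop below
        # means every "layer" shares the same weights.
        self.layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.num_steps = num_steps

    def forward(self, x):
        for _ in range(self.num_steps):
            x = self.layer(x)  # same module, same parameters each step
        return x

enc = SharedLayerEncoder()
out = enc(torch.randn(2, 10, 64))
print(out.shape)  # torch.Size([2, 10, 64])
```

Note that the Universal Transformer paper also adds a per-timestep (depth) embedding alongside the position embedding, and optionally ACT halting; sharing the weights alone, without those components, may be one reason the results fall short.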
Issue Analytics
- State:
- Created 3 years ago
- Comments: 23 (5 by maintainers)
Top Results From Across the Web
- Implementing parameter sharing (Universal Transformers): I am trying to set up Universal Transformer in the codebase and wanted some help and insight on it. Basically Universal Transformers are...
- Lessons on Parameter Sharing across Layers in Transformers: The proposed approach relaxes a widely used technique, which shares parameters for one layer with all layers, such as Universal Transformers...
- Lessons on Parameter Sharing across Layers in Transformers: This work proposes a novel parameter sharing method for Transformers that relaxes a widely used technique, which shares the parameters of...
- Parameter Sharing Methods for Multilingual Self-Attentional...: parameter sharing strategies for the Transformer model using MTL, mainly for one-to-many multilingual translation. Here, we will use the symbol Θ.
- UNIVERSAL TRANSFORMERS - OpenReview: Then, by applying a transition function (shared across position and time) to the outputs of the self-attention mechanism, independently at each position. As...
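The "relaxed" sharing mentioned in the results above can be sketched by cycling a small pool of unique layers over the full depth, in the spirit of the cycle-style assignment from the Takase & Kiyono paper (the helper name and sizes below are illustrative assumptions, not the paper's code):

```python
import torch.nn as nn

def build_cycle_layers(num_layers=6, num_unique=3, d_model=64, nhead=4):
    """Cycle-style sharing: depths 0..num_layers-1 reuse a pool of
    num_unique parameter sets in order, e.g. 0,1,2,0,1,2."""
    pool = [nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            for _ in range(num_unique)]
    # Repeating the same instances in the list shares their parameters.
    return nn.ModuleList(pool[i % num_unique] for i in range(num_layers))

layers = build_cycle_layers()
print(layers[0] is layers[3], layers[1] is layers[4])  # True True
```

With num_unique=1 this degenerates to Universal Transformer-style full sharing; with num_unique=num_layers it recovers the standard unshared Transformer.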
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Thank you very much!
Thanks @takase! That solved the problem!