Is it possible to use AutoregressiveWrapper in combination with ContinuousTransformerWrapper?
Can I combine the AutoregressiveWrapper with the ContinuousTransformerWrapper? ignore_index and pad_value are scalars, and I believe they would need to be tensors here.
Should I create a custom ContinuousAutoregressiveWrapper for this?
I’m trying to use it like this:
model = AutoregressiveWrapper(
    ContinuousTransformerWrapper(
        max_seq_len=self.max_sequence_length,
        dim_in=self.vector_dimension,
        dim_out=self.vector_dimension,
        emb_dim=self.embedding_dimension,
        use_pos_emb=True,
        attn_layers=Decoder(
            dim=self.embedding_dimension,
            depth=self.depth,
            heads=self.heads,
            attn_dropout=self.dropout,
            ff_dropout=self.dropout,
            rotary_pos_emb=True
        )
    ),
    pad_value=[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]  # For example
).to(utils.get_device())
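For reference, a custom wrapper along the lines asked about above might look roughly like the sketch below. It is only an illustration under stated assumptions, not the library's implementation: the class name ContinuousAutoregressiveSketch is hypothetical, the wrapped network is a plain nn.Linear stand-in for ContinuousTransformerWrapper, and a boolean mask combined with an MSE loss is one possible replacement for the scalar ignore_index and pad_value.

```python
import torch
from torch import nn
import torch.nn.functional as F


class ContinuousAutoregressiveSketch(nn.Module):
    """Hypothetical wrapper: next-vector prediction on continuous inputs."""

    def __init__(self, net, max_seq_len):
        super().__init__()
        # net maps (batch, seq, dim) -> (batch, seq, dim) and must be causal.
        self.net = net
        self.max_seq_len = max_seq_len

    def forward(self, x, mask=None):
        # Shift by one step: predict x[:, 1:] from x[:, :-1].
        inp, target = x[:, :-1], x[:, 1:]
        pred = self.net(inp)
        # Per-timestep MSE, averaged over the feature dimension.
        loss = F.mse_loss(pred, target, reduction="none").mean(dim=-1)
        if mask is None:
            return loss.mean()
        # mask: (batch, seq) bool, True for real (non-padded) timesteps.
        # Padded targets are excluded here instead of via ignore_index.
        keep = mask[:, 1:].float()
        return (loss * keep).sum() / keep.sum().clamp(min=1)

    @torch.no_grad()
    def generate(self, start, steps):
        # start: (batch, prompt_len, dim). Append one predicted vector per step.
        out = start
        for _ in range(steps):
            window = out[:, -self.max_seq_len:]
            next_vec = self.net(window)[:, -1:]
            out = torch.cat((out, next_vec), dim=1)
        # Return only the newly generated vectors.
        return out[:, start.shape[1]:]


# Usage with a stand-in network (assumption: 16-dimensional vectors).
net = nn.Linear(16, 16)  # acts per timestep, so it is trivially causal
model = ContinuousAutoregressiveSketch(net, max_seq_len=32)

x = torch.randn(2, 10, 16)
mask = torch.ones(2, 10, dtype=torch.bool)
mask[1, 6:] = False  # second sequence is padded after 6 steps

loss = model(x, mask=mask)
loss.backward()
generated = model.generate(x[:, :3], steps=5)  # shape (2, 5, 16)
```

Because the padding is described by a mask, no vector-valued pad_value is needed; the padded positions can hold any values since they never contribute to the loss.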
Issue Analytics
- State:
- Created 2 years ago
- Reactions:1
- Comments:14 (8 by maintainers)
Top GitHub Comments
It works! I trained a model on the MIDI of “Californication”, converted to encodings by the auto-encoder, and it actually learns to generate the song 😄 I’m curious how it will do when trained on more data.
Very cool! The FaceFormer paper has some interesting twists on the ALiBi encoding too, to make their deep net work. Maybe you can draw some inspiration from their architecture if you get stuck.