PerceiverResampler missing some LayerNorms?
Hey, it feels like `PerceiverResampler` is missing some LayerNorms?
It seems to me we should layer-norm `x` before sending it into the attention loop, and maybe also add a layer norm to `ff(latents) + latents`?
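A minimal, self-contained sketch of what the issue is proposing, not the actual flamingo-pytorch code: module and parameter names (`norm_media`, `norm_out`, `time_pos_emb`) are assumptions, and `nn.MultiheadAttention` stands in for the repo's `PerceiverAttention`. The point is only to show where the suggested LayerNorms would sit.

```python
import torch
import torch.nn as nn


class ResamplerSketch(nn.Module):
    def __init__(self, dim, depth=2, heads=8, num_latents=64, num_time_embeds=4, ff_mult=4):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(num_latents, dim))
        self.time_pos_emb = nn.Parameter(torch.randn(num_time_embeds, 1, dim))

        # Suggested in the issue: normalize the media features x once,
        # right after the temporal position embedding is added.
        self.norm_media = nn.LayerNorm(dim)

        self.layers = nn.ModuleList([
            nn.ModuleList([
                nn.MultiheadAttention(dim, heads, batch_first=True),  # stand-in for PerceiverAttention
                nn.Sequential(                                        # feed-forward block with a pre-norm
                    nn.LayerNorm(dim),
                    nn.Linear(dim, dim * ff_mult),
                    nn.GELU(),
                    nn.Linear(dim * ff_mult, dim),
                ),
            ])
            for _ in range(depth)
        ])
        self.norm_out = nn.LayerNorm(dim)  # final norm applied to the returned latents

    def forward(self, x):
        # x: (batch, time, tokens, dim) visual features
        b, t, n, d = x.shape
        x = x + self.time_pos_emb[:t]
        x = self.norm_media(x)             # <-- the LayerNorm the issue asks about
        x = x.reshape(b, t * n, d)

        latents = self.latents.unsqueeze(0).expand(b, -1, -1)
        for attn, ff in self.layers:
            # latents cross-attend to the (already normalized) media features
            attended, _ = attn(latents, x, x, need_weights=False)
            latents = attended + latents
            latents = ff(latents) + latents  # pre-norm inside ff, residual outside
        return self.norm_out(latents)        # (batch, num_latents, dim)


feats = torch.randn(2, 4, 49, 64)            # 2 clips, 4 frames, 49 tokens, dim 64
print(ResamplerSketch(dim=64)(feats).shape)  # torch.Size([2, 64, 64])
```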
Issue Analytics
- Created: a year ago
- Reactions: 3
- Comments: 7 (5 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Indeed, I'll add it soon.
Lol, just finished building it.
I am talking about `x` here: https://github.com/lucidrains/flamingo-pytorch/blob/932d2a0997b436d7c6c91d29c92d989d4fb80000/flamingo_pytorch/flamingo_pytorch.py#L104. You pass it later to the different attention layers without it being modified, so it probably makes sense to simply norm it once after adding `time_pos_emb`.
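A small runnable illustration of that suggestion, with assumed shapes and variable names rather than the repo's actual forward pass: since `x` is reused unmodified by every attention layer, a single LayerNorm after the temporal embedding is enough.

```python
import torch
import torch.nn as nn

dim, batch, times, tokens = 64, 2, 4, 49
x = torch.randn(batch, times, tokens, dim)               # media features fed to the resampler
time_pos_emb = nn.Parameter(torch.randn(times, 1, dim))  # temporal position embedding
norm_media = nn.LayerNorm(dim)

x = x + time_pos_emb[:times]  # add the temporal embedding first ...
x = norm_media(x)             # ... then norm once; every attention layer reuses this x
print(x.shape)                # torch.Size([2, 4, 49, 64])
```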