FusedLayerNorm vs torch.nn.LayerNorm
What's the advantage of using FusedLayerNorm over torch.nn.LayerNorm? I'm running into an issue with TorchScript and I'm wondering if I can replace the former with the latter.
The deeper question is: is the apex version of layer norm significantly optimized over the standard PyTorch version, or is it simply a legacy of when PyTorch did not have a built-in layer norm function?
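Because apex is an optional dependency, a common way to hedge this choice is the try/except import fallback that fairseq uses: prefer apex's fused kernel when it is installed, and fall back to the built-in `torch.nn.LayerNorm` otherwise. A minimal sketch of that pattern (the `LayerNorm` factory name here is illustrative, not from the issue):

```python
import torch.nn as nn

# Prefer apex's fused CUDA kernel when available; otherwise fall back
# to the built-in implementation. Both share the same constructor signature.
try:
    from apex.normalization import FusedLayerNorm as _LayerNorm
except ImportError:
    _LayerNorm = nn.LayerNorm

def LayerNorm(normalized_shape, eps=1e-5, elementwise_affine=True):
    """Return a fused layer norm if apex is installed, else nn.LayerNorm."""
    return _LayerNorm(normalized_shape, eps=eps, elementwise_affine=elementwise_affine)
```

With this in place, swapping implementations is a matter of which package is installed, which also sidesteps the TorchScript problem on machines without apex.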
Issue Analytics
- State:
- Created: 4 years ago
- Reactions: 1
- Comments: 13 (1 by maintainers)
I observe the same issue as @ngoyal2707 on PyTorch 1.5: torch.nn.LayerNorm is slower than apex.FusedLayerNorm for shapes typical in NLP models, for example (512, 16, 1024) normalized over the last dimension.
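A comparison like the one above can be reproduced with a small timing sketch. This is an assumption-laden benchmark, not the commenter's script: it assumes a CUDA GPU for meaningful numbers (it degrades to CPU otherwise) and only times apex if it happens to be installed.

```python
import time
import torch

def bench(layer, x, iters=100):
    """Average seconds per forward pass, with warmup and CUDA sync."""
    for _ in range(10):  # warm up kernels / allocator
        layer(x)
    if x.is_cuda:
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        layer(x)
    if x.is_cuda:
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters

device = "cuda" if torch.cuda.is_available() else "cpu"
# Shape from the comment above: (512, 16, 1024), normalizing the last dim.
x = torch.randn(512, 16, 1024, device=device)
layers = {"torch.nn.LayerNorm": torch.nn.LayerNorm(1024).to(device)}
try:
    from apex.normalization import FusedLayerNorm
    layers["apex FusedLayerNorm"] = FusedLayerNorm(1024).to(device)
except ImportError:
    pass  # apex not installed; only the built-in is timed

for name, layer in layers.items():
    print(f"{name}: {bench(layer, x) * 1e3:.3f} ms/iter")
```

Note that timing CUDA code without `torch.cuda.synchronize()` measures only kernel launch overhead, which is the usual source of misleading layer-norm benchmarks.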
@hitvoice 2080 Ti, with apex built from the master branch at the time of my previous message, so June 16th.