Does LambdaLayer need BatchNorm and activation after it?
Hello,
I'm trying to reproduce this. I'm building a LambdaResnet, and a small question: are BatchNorm and an activation needed after the LambdaLayer?
Thanks.
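For context, below is a minimal sketch of the variant the question is asking about, assuming the LambdaLayer constructor from lucidrains/lambda-networks (dim, dim_out, r, dim_k, heads, dim_u): a bottleneck block where the lambda layer replaces the 3x3 conv and is followed by the same BatchNorm + ReLU that would normally follow that conv. The block layout and hyperparameters are illustrative, not the repo's reference LambdaResnet.

```python
import torch
import torch.nn as nn
from lambda_networks import LambdaLayer  # lucidrains/lambda-networks

class LambdaBottleneck(nn.Module):
    """Bottleneck block with the 3x3 conv swapped for a LambdaLayer (illustrative sketch)."""
    def __init__(self, channels, mid_channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, mid_channels, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(mid_channels)
        # Lambda layer in place of the usual 3x3 conv; hyperparameters are illustrative.
        self.lam = LambdaLayer(dim=mid_channels, dim_out=mid_channels,
                               r=23, dim_k=16, heads=4, dim_u=1)
        self.bn2 = nn.BatchNorm2d(mid_channels)   # <- the BatchNorm in question
        self.conv3 = nn.Conv2d(mid_channels, channels, 1, bias=False)
        self.bn3 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.relu(self.bn2(self.lam(out)))  # <- the activation in question
        out = self.bn3(self.conv3(out))
        return self.relu(out + x)                 # identity shortcut (stride 1, same channel count)

# e.g. LambdaBottleneck(256, 64)(torch.randn(2, 256, 32, 32)) -> shape (2, 256, 32, 32)
```

The alternative the thread weighs is simply dropping bn2 and the ReLU applied after self.lam.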
Issue Analytics
- Created: 3 years ago
- Reactions: 3
- Comments: 7 (3 by maintainers)
Top Results From Across the Web
- Batch Norm Explained Visually — How it works, and why ...: "Calculate the normalized values for each activation feature vector using the corresponding mean and variance. These normalized values now have ..."
- Ordering of batch normalization and dropout? - Stack Overflow: "So the Batch Normalization Layer is actually inserted right after a Conv ... As far as dropout goes, I believe dropout is applied ..."
- Batch normalization and the need for bias in neural networks: "i.e. each activation is shifted by its own shift parameter (beta). So yes, the batch normalization eliminates the need for a bias vector ..."
- [D] Batch Normalization before or after ReLU? - Reddit: "BN after activation will normalize the positive features without ... It seems a lot of folks have false notions about BatchNorm."
- Where should I place the batch normalization layer(s)?: "thinking: Just before or after the activation function layer? ... @shirui-japina In general, Batch Norm layer is usually added before ..."
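The snippets above all point to the same conventional ordering; as a quick illustration (not part of the original issue), here is that pattern in PyTorch: convolution, then BatchNorm, then the activation, with the convolution's bias disabled because BatchNorm's learnable shift (beta) makes it redundant.

```python
import torch.nn as nn

def conv_bn_relu(in_ch, out_ch, kernel_size=3):
    """Conventional Conv -> BatchNorm -> ReLU ordering."""
    return nn.Sequential(
        # bias=False: BatchNorm's beta parameter already provides a per-channel shift
        nn.Conv2d(in_ch, out_ch, kernel_size, padding=kernel_size // 2, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )
```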
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Hi @lucidrains, thanks for the reply. I tested them all, and I think you're right: when I apply bn+relu, the val accuracy doesn't grow. This is my final implementation. Now I'm training LambdaResnet50 and it's looking good. I use the same standard training steps in my project as with Resnet50, except batch_size is set to 64. Epoch 28 result: Train Acc: 0.44, Loss: 2.54; Val Acc: 0.48, Loss: 3.1748e+08. Some observations:
- Parameters and GFLOPs are small, but training speed and GPU memory cost are still high.
- torch.cuda.amp makes the train loss nan, so I can only train with a batch_size of 64 (see the sketch below).
- Resnet50.
@lucidrains Unfortunately, I only got a best top-1 of 76.1 on the val set (79.2 on the train set). I'd better wait for the authors to release their code.