Does LambdaLayer need BatchNorm and activation after it?


Hello, I’m trying to reproduce this. I’m building a LambdaResNet, and I have a small question: are BatchNorm and an activation needed after the LambdaLayer? Thanks.
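For context, a LambdaLayer is typically dropped into a ResNet bottleneck in place of the 3x3 spatial convolution, and the question is whether a BatchNorm + activation should follow it the way they follow the conv. Below is a minimal sketch of such a block, not code from this thread: it assumes the LambdaLayer constructor from lucidrains' lambda_networks package, uses illustrative hyperparameters, and ignores striding/downsampling. The choice under discussion is exposed as a post_norm_act flag.

```python
import torch
import torch.nn as nn
from lambda_networks import LambdaLayer  # lucidrains' implementation

class LambdaBottleneck(nn.Module):
    """ResNet-style bottleneck with the 3x3 conv replaced by a LambdaLayer.

    `post_norm_act` toggles the BatchNorm + ReLU right after the lambda layer,
    which is exactly the choice the question asks about.
    """
    def __init__(self, in_ch, mid_ch, out_ch, post_norm_act=False):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, mid_ch, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(mid_ch)
        # illustrative hyperparameters; see the lambda_networks README for their meaning
        self.lam = LambdaLayer(dim=mid_ch, dim_out=mid_ch, r=23, dim_k=16, heads=4, dim_u=1)
        # optional BatchNorm + ReLU after the lambda layer
        self.post = (nn.Sequential(nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True))
                     if post_norm_act else nn.Identity())
        self.conv3 = nn.Conv2d(mid_ch, out_ch, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)
        self.shortcut = (nn.Sequential(nn.Conv2d(in_ch, out_ch, 1, bias=False),
                                       nn.BatchNorm2d(out_ch))
                         if in_ch != out_ch else nn.Identity())

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.post(self.lam(out))          # with or without BN+ReLU here
        out = self.bn3(self.conv3(out))
        return self.relu(out + self.shortcut(x))

block = LambdaBottleneck(256, 64, 256, post_norm_act=False)
y = block(torch.randn(2, 256, 56, 56))          # -> (2, 256, 56, 56)
```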

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 3
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

13 reactions
PistonY commented, Oct 21, 2020

Hi @lucidrains, thanks for the reply. I tested them all, and I think you’re right: when I apply BN+ReLU, the val accuracy doesn’t grow. This is my final implementation.

Now I’m training LambdaResNet-50, and it’s looking good. I use the same standard training setup in my project as for ResNet-50, except batch_size is set to 64.

Epoch 28 result: Train Acc: 0.44, Loss: 2.54; Val Acc: 0.48, Loss: 3.1748e+08. Some observations:

  1. Parameters and GFLOPs are small, but training speed and GPU memory cost are still high.
  2. Mixed-precision (FP16) training with torch.cuda.amp makes the training loss NaN, so I can only train with a batch size of 64 (a typical amp training step is sketched after this list for reference).
  3. Val loss is strange.
  4. Convergence is much slower than ResNet-50.
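(For reference on point 2: a standard torch.cuda.amp training step looks like the sketch below. The model, criterion and optimizer names are placeholders. Unscaling and clipping the gradients before the optimizer step is a common thing to try when the loss goes NaN under FP16, though it is not guaranteed to fix this particular case.)

```python
import torch
from torch.cuda.amp import autocast, GradScaler

scaler = GradScaler()

def train_step(model, images, labels, criterion, optimizer):
    optimizer.zero_grad()
    with autocast():                         # forward pass runs in mixed precision (FP16 where safe)
        logits = model(images)
        loss = criterion(logits, labels)
    scaler.scale(loss).backward()            # scale the loss so small FP16 gradients don't underflow
    scaler.unscale_(optimizer)               # return gradients to true scale before clipping
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=5.0)
    scaler.step(optimizer)                   # skips the update if any gradient is inf/NaN
    scaler.update()                          # adjusts the loss scale for the next step
    return loss.item()
```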
3 reactions
PistonY commented, Oct 26, 2020

@lucidrains Unfortunately, I only got a best top-1 of 76.1 on the val set (79.2 on the train set). I’d better wait for the authors to release their code.


Top Results From Across the Web

Batch Norm Explained Visually — How it works, and why ...
Calculate the normalized values for each activation feature vector using the corresponding mean and variance. These normalized values now have ...
Ordering of batch normalization and dropout? - Stack Overflow
So the Batch Normalization Layer is actually inserted right after a Conv ... As far as dropout goes, I believe dropout is applied...
Batch normalization and the need for bias in neural networks
i.e. each activation is shifted by its own shift parameter (beta). So yes, the batch normalization eliminates the need for a bias vector....
[D] Batch Normalization before or after ReLU? - Reddit
BN after activation will normalize the positive features without ... It seems a lot of folks have false notions about BatchNorm.
Where should I place the batch normalization layer(s)?
thinking: Just before or after the activation function layer? ... @shirui-japina In general, Batch Norm layer is usually added before ...
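The results above touch on two recurring points: what BatchNorm actually computes (normalize each channel with the batch mean and variance, then scale and shift with the learned gamma and beta, which is why a bias on the preceding conv is redundant) and where to place it relative to the activation. A small PyTorch illustration of both, with made-up shapes:

```python
import torch
import torch.nn as nn

# What BatchNorm computes (per channel, over the batch and spatial dims) in training mode
x = torch.randn(8, 16, 32, 32)                            # N, C, H, W
bn = nn.BatchNorm2d(16)
bn.train()
y = bn(x)

mean = x.mean(dim=(0, 2, 3), keepdim=True)                # per-channel batch mean
var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)  # per-channel batch variance
x_hat = (x - mean) / torch.sqrt(var + bn.eps)             # normalized activations
manual = bn.weight.view(1, -1, 1, 1) * x_hat + bn.bias.view(1, -1, 1, 1)  # scale (gamma), shift (beta)
print(torch.allclose(y, manual, atol=1e-5))               # True

# The two placements debated above
conv_bn_relu = nn.Sequential(                             # BN before the activation (original paper ordering)
    nn.Conv2d(16, 32, 3, padding=1, bias=False),          # bias omitted: BN's beta already shifts each channel
    nn.BatchNorm2d(32),
    nn.ReLU(inplace=True),
)
conv_relu_bn = nn.Sequential(                             # BN after the activation (the alternative some prefer)
    nn.Conv2d(16, 32, 3, padding=1),
    nn.ReLU(inplace=True),
    nn.BatchNorm2d(32),
)
```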
