hit nan for variance_normalized
See original GitHub issue

Not certain this is a bug yet, but I’m getting this rarely after a while of training and am not finding an issue on my side. The input to the loss function looks good (no NaNs). I’m working with a fairly complex loss function though, so it’s very possible I have a rare bug in my code.
I’m using the following options:

Ranger21(
    params=params, lr=3e-4,
    num_epochs=1e12, num_batches_per_epoch=1, num_warmup_iterations=1000,
    using_gc=True, weight_decay=1e-4, use_madgrad=True,
)
I’ve seen this with batch sizes from 4 to 128 so far, so it doesn’t seem to depend on batch size.
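One plausible way a `variance_normalized` value can go NaN (this is a hedged sketch, not taken from the Ranger21 source: the function name `variance_normalize` and the centering step are assumptions) is a degenerate gradient whose variance underflows to exactly zero, turning the normalization into 0/0. A small epsilon in the denominator avoids it:

```python
import numpy as np

def variance_normalize(x, eps=0.0):
    # Hypothetical sketch of a variance-normalization step:
    # center x, then scale by its standard deviation.
    return (x - x.mean()) / np.sqrt(np.var(x) + eps)

g = np.full(8, 3e-4)                  # constant gradient: variance is exactly 0
bad = variance_normalize(g)           # (x - mean) is 0 and std is 0 -> 0/0 -> NaN
good = variance_normalize(g, eps=1e-8)
print(np.isnan(bad).any(), np.isnan(good).any())  # → True False
```

If something like this is the cause, it would also explain why batch size doesn’t matter: any batch that happens to produce a (near-)constant gradient slice triggers it.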
Issue Analytics
- State:
- Created 2 years ago
- Comments:7 (2 by maintainers)
Top GitHub Comments
Hi @jimmiebtlr - I’ve just checked in an update that applies a softplus transform along with gradient normalization. That should be helpful here, since it boosts very small values; it also set a new high on our small benchmark dataset.
I’ll give that a shot. It takes some time to reproduce, so I’ll post back once I know one way or the other. Thanks!
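The softplus transform mentioned above can be sketched as follows (a sketch under assumptions, not the actual Ranger21 code; the `beta=50` value is an assumption). Softplus is strictly positive everywhere, so passing a denominator through it floors the value away from zero while leaving larger values nearly unchanged:

```python
import numpy as np

def softplus(x, beta=50.0):
    # softplus(x) = (1/beta) * log(1 + exp(beta * x));
    # approximates max(x, 0) for large beta but never reaches 0.
    # (A numerically robust version would clamp beta*x; fine for small inputs.)
    return np.log1p(np.exp(beta * x)) / beta

std = 0.0                                     # degenerate denominator that would cause 0/0
floor = softplus(std)                         # log(2)/beta: strictly positive
grad = np.zeros(4)
print(np.isnan(grad / floor).any())           # → False
```

This matches the comment’s framing: very small (or zero) denominators get boosted to a positive floor, so the division can no longer produce NaN or inf.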