
loss increasing with larger input range


When I multiply the input by sqrt(512), the loss always increases, reaching the order of 1e2~1e3 from the very beginning of training. The only change is in fconv.py:

import math

from torch import nn


def Embedding(num_embeddings, embedding_dim, padding_idx):
    m = nn.Embedding(num_embeddings, embedding_dim, padding_idx=padding_idx)
    m.weight.data.normal_(0, 0.1)
    m.weight.data.mul_(math.sqrt(embedding_dim))  # I add it here
    return m

I thought the operation above only changes the input's magnitude from 1e-4 to 1e-2, so can you tell me why the loss explodes? PS: I'm sure the loss decreases normally without the mul_.
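One thing worth noting: multiplying the embedding weights by sqrt(512) ≈ 22.6 scales every downstream activation, and cross-entropy loss grows roughly linearly with the logit scale whenever the correct class is not the arg-max. A minimal pure-Python sketch with made-up logits (illustrative numbers only, not values from the actual model):

```python
import math

def cross_entropy(logits, target):
    """Numerically stable cross-entropy: log-sum-exp(logits) - logits[target]."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target]

logits = [0.5, -0.3, 0.1, -0.9]   # hypothetical small-scale logits
for scale in (1.0, 22.6):         # 22.6 ~ sqrt(512)
    scaled = [l * scale for l in logits]
    # loss grows roughly linearly with the scale of the logits
    print(scale, round(cross_entropy(scaled, target=1), 2))
```

Under this toy model, scaling the logits by sqrt(512) turns a modest loss into one an order of magnitude larger on a 4-class problem; over a vocabulary-sized softmax the effect is far larger, which is at least consistent with the ~1e3 losses in the log below.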


Here is part of the log:

| epoch 001:   0%|                                         | 13/14254 [00:19<5:38:48,  1.43s/it, loss=1075.60 (897.77), wps=4021, wpb=5703, bsz=131, lr=0.25, clip=100%, gnorm=1689590071670.1538]
| epoch 001:   1%|3                                      | 114/14254 [02:44<5:46:16,  1.47s/it, loss=1859.97 (1038.92), wps=3977, wpb=5695, bsz=169, lr=0.25, clip=100%, gnorm=7599585518747.3330]
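The gnorm and clip=100% fields in that log are themselves diagnostic: every single update is being clipped, i.e. the raw gradient norm always exceeds the clipping threshold. For reference, norm-based clipping (the behaviour of torch.nn.utils.clip_grad_norm_, sketched here in pure Python over a flat list of gradient values) rescales the whole gradient vector:

```python
import math

def clip_grad_norm(grads, max_norm):
    """Rescale grads in place so their global L2 norm is at most max_norm;
    returns the pre-clipping norm (mirrors torch.nn.utils.clip_grad_norm_)."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads[:] = [g * scale for g in grads]
    return total_norm

grads = [3.0, 4.0]                     # global L2 norm is 5.0
pre = clip_grad_norm(grads, max_norm=0.1)
print(pre)                             # 5.0 (norm before clipping)
print(grads)                           # [0.06, 0.08] (rescaled to norm 0.1)
```

With gnorm around 1e12, clipping keeps each update finite, but the update direction is dominated by whatever component blew up, so training is unlikely to recover on its own.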

Issue Analytics

  • State: closed
  • Created: 6 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
michaelauli commented, Nov 5, 2017

Those values worked well based on cross validation experiments.

0 reactions
Zrachel commented, Nov 1, 2017

Thank you. Is there any explanation on why input should be in a small range (normal(0, 0.1)) instead of normal(0, 1)?
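The maintainers justify the constant empirically (cross-validation above), but one common intuition, offered here as an assumption rather than the official rationale: with variance-preserving weight init (var_w ≈ 1/fan_in), the activation scale at every layer tracks the input scale, so the embedding std directly sets the scale of the activations entering the softmax. A small analytic sketch with assumed numbers:

```python
import math

fan_in = 512           # assumed: one linear unit over a 512-dim embedding
var_w = 1.0 / fan_in   # variance-preserving weight init (assumption)

for std_x in (0.1, 1.0):
    # variance of a sum of fan_in independent products is fan_in * var_w * var_x,
    # so the output std tracks the input std one-to-one under this init
    std_out = math.sqrt(fan_in * var_w) * std_x
    print(std_x, std_out)
```

Under this model, normal(0, 1) embeddings make activations roughly 10x larger than normal(0, 0.1) everywhere downstream, sharpening the softmax and inflating both loss and gradients, which would compound with the sqrt(512) multiplier discussed in the question.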


Top Results From Across the Web

Good accuracy despite high loss value - Cross Validated
I have 5 points, and for example input -1 has lead to output 0. ... The cross entropy is rising, the selected a...

Possible explanations for loss increasing? - Stack Overflow
What are the possible explanations for my loss increasing like this? My initial learning rate is set very low: 1e-6, but I've tried...

Interpreting Loss Curves | Machine Learning
A large increase in loss is typically caused by anomalous values in input data. Possible causes are: NaNs in input data. Exploding gradient...

Loss and Loss Functions for Training Deep Learning Neural ...
Cross-entropy loss is minimized, where smaller values represent a better model than larger values. A model that predicts perfect probabilities ...

Loss increasing instead of decreasing - PyTorch Forums
I have a GRU layer and a fully connected using a single hidden layer. My inputs are variable sized arrays that were padded...
