Increasingly negative loss in variational autoencoder: is it normal?
Hi, not sure if this is an issue. I am training the variational autoencoder on a different set of images with 3 color channels, and I am getting an increasingly negative loss. Is this a normal, valid outcome, or is it a bug?
Shouldn't the loss value always be positive? I am worried that the search for a minimum of the loss function may never converge if the loss is not bounded below by 0.
```
Building model and compiling functions...
L = 2, z_dim = 1, n_hid = 3, binary=True
Starting training...
Epoch 1 of 300 took 36.576s
training loss: 1193603.765134
validation loss: 358401.526396
Epoch 2 of 300 took 34.345s
training loss: 170094.748865
validation loss: -990985.720292
Epoch 3 of 300 took 34.682s
training loss: -948598.243076
validation loss: -2374793.240720
Epoch 4 of 300 took 33.571s
training loss: -2179357.580108
validation loss: -3822347.805930
Epoch 5 of 300 took 36.031s
training loss: -3293897.853456
validation loss: -5299324.057571
```
I tried with 3 or 1024 hidden units, and with a z dimension of either 1 or 2, but the result is the same. With the regular MNIST data I have no issues: the loss is positive and decreases toward 0.
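For reference, binary cross-entropy is only guaranteed to be non-negative when both predictions and targets lie in [0, 1]; once targets leave that range, the loss can become arbitrarily negative, which matches the numbers above. A quick NumPy check (the `bce` helper and its values are illustrative, not taken from the training code):

```python
import numpy as np

def bce(target, pred, eps=1e-7):
    # Elementwise binary cross-entropy, summed over the array.
    pred = np.clip(pred, eps, 1 - eps)
    return -np.sum(target * np.log(pred) + (1 - target) * np.log(1 - pred))

pred = np.full(4, 0.9)
print(bce(np.ones(4), pred))       # targets in [0, 1]: loss >= 0 (~0.42)
print(bce(np.full(4, 3.0), pred))  # targets outside [0, 1]: loss < 0 (~-17.2)
```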
Please have a closer look at which loss function you're using. I guess that for MNIST it uses binary cross-entropy, which requires both predictions and targets to be between 0 and 1. You may want to try mean-squared error instead. Also take care of the network's output nonlinearity: for MNIST it is probably a sigmoid, producing outputs in (0, 1). You might also have success scaling the input data to [-1, 1] and using tanh outputs (as in the DCGAN paper, for example).
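A minimal, framework-neutral sketch of those suggestions (all function names and the dummy data are illustrative, not from the issue's actual model):

```python
import numpy as np

# Option 1: mean-squared error as the reconstruction term. Unlike binary
# cross-entropy, it is non-negative for any real-valued inputs and targets.
def mse_reconstruction(x, x_hat):
    return np.mean((x - x_hat) ** 2)

# Option 2: keep binary cross-entropy, but make both sides live in [0, 1]:
# rescale the inputs and put a sigmoid on the decoder output.
def to_unit_range(x):
    # Min-max scaling to [0, 1]; assumes x is not constant.
    return (x - x.min()) / (x.max() - x.min())

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Option 3 (DCGAN-style): rescale inputs to [-1, 1] and use tanh outputs,
# paired with a loss such as MSE that tolerates negative values.
def to_symmetric_range(x):
    return 2.0 * to_unit_range(x) - 1.0

x = np.random.randint(0, 256, size=(4, 3, 8, 8)).astype(np.float32)  # dummy RGB batch
print(to_unit_range(x).min(), to_unit_range(x).max())            # 0.0 1.0
print(to_symmetric_range(x).min(), to_symmetric_range(x).max())  # -1.0 1.0
```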
This will do it: subtracting the mean shifts the mean of the resulting array to 0, while dividing by std_dev scales the standard deviation (and variance) to 1.
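A minimal sketch of that standardization, assuming NumPy arrays (the `standardize` helper and the dummy batch are illustrative):

```python
import numpy as np

def standardize(x):
    # Shift the mean to 0 and scale the standard deviation to 1.
    mean = x.mean()
    std_dev = x.std()
    return (x - mean) / std_dev

images = np.random.rand(10, 3, 32, 32).astype(np.float32)  # dummy 3-channel batch
z = standardize(images)
print(z.mean(), z.std())  # ~0.0 and ~1.0
```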