
Different loss values in "model.fit" and "print(the-loss)"

See original GitHub issue

I tried a simple linear approximation and I get different loss values from model.fit and from computing tf.keras.losses.MeanSquaredError() directly. I don't know what I'm doing wrong. Could you please tell me whether my source code is at fault, or whether this is a bug on the Keras side?

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): NAME="Ubuntu" VERSION="20.04.3 LTS (Focal Fossa)"
  • TensorFlow installed from (source or binary): pipenv install tensorflow
  • TensorFlow version (use command below): v2.7.0-rc1-69-gc256c071bb2
  • Python version: Python 3.8.10 (default, Sep 28 2021, 16:10:42)
  • Bazel version (if compiling from source): N/A (installed via pipenv, not compiled from source)
  • GPU model and memory: no GPU was used for this issue
  • Exact command to reproduce: python main_sub.py

Below is main_sub.py:

import random

import numpy as np
import tensorflow as tf
from tensorflow import keras  # needed for keras.initializers / keras.Model in linear() below
from tensorflow.keras import layers

input_dim = 1
output_dim = 1
batch_size = 5


class DummyGenerator:
    def __init__(self, data_num=100):
        self._data_num = data_num
        # For input_dim == 1 these two arrays hold identical values (both i / data_num),
        # so a single weight of 1.0 already fits the data exactly.
        self._input = np.array([[(j + 1) * i for j in range(input_dim)] for i in range(data_num)]) / data_num
        self._target = np.array([i for i in range(data_num)]) / data_num
        self._batch_indices = list(range(data_num))

    def __call__(self, batch_size):
        # Infinite generator of randomly sampled (input, target) batches.
        while True:
            batch_indices = random.choices(self._batch_indices, k=batch_size)
            batch_input = np.array([self._input[i] for i in batch_indices])
            batch_target = np.array([self._target[i] for i in batch_indices])
            yield batch_input, batch_target


def linear(input_dim, output_dim, order=5):
    # One Dense layer, no bias, weights initialized to 1.0 (the `order` argument is unused).
    # Note the declared input shape (None, input_dim) is rank 3 including the batch axis,
    # while the generator yields rank-2 batches of shape (batch_size, input_dim).
    inputs = layers.Input(shape=(None, input_dim))
    z = layers.Dense(output_dim, use_bias=False, kernel_initializer=keras.initializers.Ones())(inputs)
    return keras.Model(inputs=inputs, outputs=z)


def check_model(model, test_generator):
    # Print the trainable weights, then measure the MSE on one fresh batch.
    for layer in model.layers:
        if len(layer.get_weights()) != 0:
            print(layer.name, np.squeeze(layer.get_weights()))
    x, z = next(test_generator(batch_size))
    # Squeeze the predictions down to shape (batch_size,) so they line up with the targets.
    z_est = np.squeeze(model.predict_on_batch(x))
    print(z_est)
    print(z)
    # MeanSquaredError expects (y_true, y_pred); the order is harmless for MSE but kept conventional here.
    print("loss", tf.keras.losses.MeanSquaredError()(z, z_est).numpy())


def main():
    epochs = 30

    model = linear(input_dim, output_dim, order=1)
    model.summary()
    # A learning rate of 0 makes training a no-op: the weight stays at its initial 1.0.
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0), loss=tf.keras.losses.MeanSquaredError())

    train_generator = DummyGenerator()
    test_generator = DummyGenerator()

    check_model(model, test_generator)
    # Model.fit_generator is deprecated; Model.fit accepts generators directly
    # (hence the UserWarning in the console log below).
    model.fit_generator(
        train_generator(batch_size),
        epochs=epochs,
        steps_per_epoch=10,
        verbose=1,
        validation_steps=4,
        validation_data=test_generator(batch_size)
    )
    check_model(model, test_generator)


main()
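
One thing worth checking in this script (my own observation, not from the thread): check_model squeezes the predictions to shape (batch_size,) before computing the loss, while fit sees the model's raw output against the rank-1 targets, and the underlying mean-squared-error math silently broadcasts mismatched shapes instead of raising an error. A minimal sketch of that effect, using the functional form of the loss:

import numpy as np
import tensorflow as tf

y_true = np.array([0.85, 0.5, 0.49, 0.02, 0.38])   # targets, shape (5,)
y_pred = y_true.reshape(-1, 1)                     # same numbers, shape (5, 1)

# Shapes match after squeezing: the loss is exactly 0.
matched = tf.keras.losses.mean_squared_error(y_true, np.squeeze(y_pred))
print(tf.reduce_mean(matched).numpy())             # 0.0

# Shapes (5,) vs (5, 1) broadcast to (5, 5): every target is compared against
# every prediction, and the mean lands near 2 * var(y_true), roughly 0.14 here.
broadcast = tf.keras.losses.mean_squared_error(y_true, y_pred)
print(tf.reduce_mean(broadcast).numpy())           # ~0.14

Whether this particular broadcasting happens inside this exact fit call depends on how Keras reconciles the declared (None, input_dim) input shape with the generator's rank-2 batches, so treat it as a hypothesis to verify (for example, by printing model(x).shape) rather than a confirmed diagnosis.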

Console log:

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, None, 1)]         0         
                                                                 
 dense (Dense)               (None, None, 1)           1         
                                                                 
=================================================================
Total params: 1
Trainable params: 1
Non-trainable params: 0
_________________________________________________________________
dense 1.0
[0.85 0.5  0.49 0.02 0.38]
[0.85 0.5  0.49 0.02 0.38]
loss 1.3646416787805004e-16
python/main/ml/main_sub.py:55: UserWarning: `Model.fit_generator` is deprecated and will be removed in a future version. Please use `Model.fit`, which supports generators.
  model.fit_generator(
Epoch 1/30
10/10 [==============================] - 0s 8ms/step - loss: 0.1417 - val_loss: 0.1386
Epoch 2/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1346 - val_loss: 0.1239
Epoch 3/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1582 - val_loss: 0.1484
Epoch 4/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1102 - val_loss: 0.1189
Epoch 5/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1340 - val_loss: 0.1489
Epoch 6/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1432 - val_loss: 0.0532
Epoch 7/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1539 - val_loss: 0.1009
Epoch 8/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1514 - val_loss: 0.1236
Epoch 9/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1411 - val_loss: 0.1249
Epoch 10/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1485 - val_loss: 0.0954
Epoch 11/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1305 - val_loss: 0.1054
Epoch 12/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1332 - val_loss: 0.0985
Epoch 13/30
10/10 [==============================] - 0s 2ms/step - loss: 0.0979 - val_loss: 0.0924
Epoch 14/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1228 - val_loss: 0.1668
Epoch 15/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1298 - val_loss: 0.1723
Epoch 16/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1613 - val_loss: 0.1587
Epoch 17/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1662 - val_loss: 0.1444
Epoch 18/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1144 - val_loss: 0.1549
Epoch 19/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1108 - val_loss: 0.1398
Epoch 20/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1426 - val_loss: 0.0888
Epoch 21/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1361 - val_loss: 0.1825
Epoch 22/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1690 - val_loss: 0.1998
Epoch 23/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1233 - val_loss: 0.1688
Epoch 24/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1323 - val_loss: 0.1651
Epoch 25/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1014 - val_loss: 0.1740
Epoch 26/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1377 - val_loss: 0.1583
Epoch 27/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1197 - val_loss: 0.1805
Epoch 28/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1468 - val_loss: 0.1635
Epoch 29/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1269 - val_loss: 0.1220
Epoch 30/30
10/10 [==============================] - 0s 2ms/step - loss: 0.1440 - val_loss: 0.1531
dense 1.0
[0.29 0.54 0.74 0.72 0.78]
[0.29 0.54 0.74 0.72 0.78]
loss 4.516209738774887e-16
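
Note how to read these numbers (per the comments and the ProgbarLogger link below): the "loss:" value in the progress bar is an aggregate over the batches seen so far in the epoch, while check_model measures a single fresh batch after squeezing the shapes. A toy illustration with invented per-batch numbers:

# Hypothetical per-batch losses within one epoch (invented numbers).
batch_losses = [0.20, 0.16, 0.12, 0.10, 0.13]

# The progress bar shows a running mean over the epoch so far,
# so the printed "loss:" is generally not the loss of any single batch.
for step, loss in enumerate(batch_losses, start=1):
    running_mean = sum(batch_losses[:step]) / step
    print(f"step {step}: batch loss {loss:.2f}, displayed loss {running_mean:.4f}")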

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 12 (4 by maintainers)

Top GitHub Comments

1 reaction
jvishnuvardhan commented, Feb 25, 2022

I am closing this issue, as it was resolved. Feel free to reopen if I am mistaken. Thanks!

1 reaction
chenmoneygithub commented, Jan 21, 2022
  1. Not the final epoch, but the last batch of the final epoch. Each epoch consists of many batches.
  2. Exactly. Please see this tutorial: https://www.tensorflow.org/guide/keras/writing_a_training_loop_from_scratch

Yes, it looks strange, but perhaps the last batch just happens to be easy to process; writing a custom training loop should give a clearer picture (a sketch follows below).
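
For reference, here is a minimal custom training loop in the spirit of the linked tutorial, adapted to the names used in this issue (model, train_generator, batch_size). It prints the raw per-batch loss, which is exactly the quantity fit optimizes before any progress-bar aggregation; treat it as a sketch, not a drop-in addition to the script:

import tensorflow as tf

optimizer = tf.keras.optimizers.Adam(learning_rate=0.0)
loss_fn = tf.keras.losses.MeanSquaredError()

batches = train_generator(batch_size)
for step in range(10):
    x_batch, y_batch = next(batches)
    with tf.GradientTape() as tape:
        y_pred = model(x_batch, training=True)   # raw model output, no squeezing
        loss = loss_fn(y_batch, y_pred)          # the same per-batch loss fit() computes
    grads = tape.gradient(loss, model.trainable_weights)
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
    print(f"step {step}: batch loss = {float(loss):.6f}")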

Read more comments on GitHub >

Top Results From Across the Web

Returned loss value is different than the loss printed with ...
The problem is that the ProgbarLogger prints an average of the values (loss, regularization loss, other metrics), which are the values shown ...

Keras Loss Functions: Everything You Need to Know
In deep learning, the loss is computed to get the gradients with respect to model weights and update those weights accordingly via ...

Losses - Keras
The purpose of loss functions is to compute the quantity that a model should seek to minimize during training. Available losses. Note that ...

TensorFlow Keras: print out and save loss and gradients ...
Is there a way to print out and also save the loss function value, the gradients, and norm of the gradients, for each ...

Loss Functions in TensorFlow - MachineLearningMastery.com
The loss metric is very important for neural networks. As all machine learning models are one optimization problem or another, the loss is ...
