
Compilation options of a multi-output model: multiple losses & loss weighting


As described in the Keras handbook, Deep Learning with Python, a multi-output model needs a different loss function for each head of the network. But because gradient descent requires you to minimize a scalar, these losses must be combined into a single value in order to train the model.

Very imbalanced loss contributions will cause the model representations to be optimized preferentially for the task with the largest individual loss, at the expense of the other tasks. To remedy this, you can assign different levels of importance to the loss values in their contribution to the final loss. This is useful in particular if the losses’ values use different scales.
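
Concretely, the combined objective is just the weighted sum of the per-output losses. Below is a minimal sketch of that combination; the function and dictionary names are illustrative, not Keras API — Keras performs this internally when you pass loss_weights to compile:

    # Sketch only: Keras does this combination internally when you
    # pass loss_weights to model.compile().
    def combined_loss(per_output_losses, loss_weights):
        """Weighted sum of per-output loss values -> one scalar."""
        return sum(loss_weights[name] * value
                   for name, value in per_output_losses.items())

    # e.g. combined_loss({'VALENCE': 3.2, 'AGE': 0.7},
    #                    {'VALENCE': 0.025, 'AGE': 0.45}) == 0.395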

Can anyone help with the following:

I’ve got a five-output model as described in #10120. The outputs of the model are the following:

  1. emotion (multiclass, multilabel classification)
  2. valence (regression)
  3. arousal (regression)
  4. dominance (regression)
  5. age (multiclass classification)

I am using the following:

    # Per-output loss functions, keyed by output-layer name.
    losses_list = {'EMOTIONS': 'binary_crossentropy',
                   'VALENCE': 'mse',
                   'AROUSAL': 'mse',
                   'DOMINANCE': 'mse',
                   'AGE': 'categorical_crossentropy'}
    # Relative contribution of each loss to the combined training loss.
    losses_weights = {'EMOTIONS': 1.0,
                      'VALENCE': 0.025,
                      'AROUSAL': 0.025,
                      'DOMINANCE': 0.025,
                      'AGE': 0.45}
    # Per-output metrics reported during training and validation.
    metrics = {'EMOTIONS': 'crossentropy',
               'VALENCE': 'mse',
               'AROUSAL': 'mse',
               'DOMINANCE': 'mse',
               'AGE': 'categorical_accuracy'}

Can anyone comment on this? Are these the right loss functions? Are these the right weights, and are the metrics set properly?
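
For reference, those loss choices do match the output types (binary cross-entropy with a sigmoid multilabel head, MSE with linear regression heads, categorical cross-entropy with a softmax head). Wiring the three dictionaries into compile would look like the sketch below; it assumes the model's output layers are named 'EMOTIONS', 'VALENCE', 'AROUSAL', 'DOMINANCE', and 'AGE' to match the dictionary keys, and the optimizer choice is illustrative:

    # Assumes `model` has output layers named to match the dict keys
    # above; 'adam' is an illustrative optimizer choice.
    model.compile(optimizer='adam',
                  loss=losses_list,
                  loss_weights=losses_weights,
                  metrics=metrics)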

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Reactions: 3
  • Comments: 10 (2 by maintainers)

Top GitHub Comments

48 reactions
brge17 commented, May 29, 2018

Here is a fully functioning example that may help you out: MNIST as an autoencoder and a classifier at the same time.

import keras

from keras.datasets import mnist
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, UpSampling2D

batch_size = 100
num_classes = 10
epochs = 10

# input image dimensions
img_rows, img_cols = 28, 28

# Data 
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1).astype('float32') / 255
x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1).astype('float32') / 255
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

# Convolutional Encoder
input_img = Input(shape=(img_rows, img_cols, 1))
conv_1 = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
pool_1 = MaxPooling2D((2, 2), padding='same')(conv_1)
conv_2 = Conv2D(8, (3, 3), activation='relu', padding='same')(pool_1)
pool_2 = MaxPooling2D((2, 2), padding='same')(conv_2)
conv_3 = Conv2D(8, (3, 3), activation='relu', padding='same')(pool_2)
encoded = MaxPooling2D((2, 2), padding='same')(conv_3)

# Classification
flatten = Flatten()(encoded)
fc = Dense(128, activation='relu')(flatten)
softmax = Dense(num_classes, activation='softmax', name='classification')(fc)

# Decoder
conv_4 = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
up_1 = UpSampling2D((2, 2))(conv_4)
conv_5 = Conv2D(8, (3, 3), activation='relu', padding='same')(up_1)
up_2 = UpSampling2D((2, 2))(conv_5)
# no padding here: 16x16 -> 14x14, so the final UpSampling2D restores 28x28
conv_6 = Conv2D(16, (3, 3), activation='relu')(up_2)
up_3 = UpSampling2D((2, 2))(conv_6)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same', name='autoencoder')(up_3)

model = Model(inputs=input_img, outputs=[softmax, decoded])

model.compile(loss={'classification': 'categorical_crossentropy', 
                    'autoencoder': 'binary_crossentropy'},
              loss_weights={'classification': 1.0,
                            'autoencoder': 0.5},
              optimizer='adam',
              metrics={'classification': 'accuracy', 'autoencoder': ['binary_crossentropy', 'mse']})

model.fit(x_train, 
          {'classification': y_train, 'autoencoder': x_train},
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(x_test, {'classification': y_test, 'autoencoder': x_test}),
          verbose=1)
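
Once trained, predict() returns one array per output, in the order they were passed to Model(...); a brief usage sketch:

    # predict() returns the outputs in the order given to Model(...):
    # class probabilities first, then reconstructed images.
    class_probs, reconstructions = model.predict(x_test, batch_size=batch_size)
    print(class_probs.shape)      # (10000, 10)
    print(reconstructions.shape)  # (10000, 28, 28, 1)
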
8 reactions
GKalliatakis commented, Jan 30, 2020

Very imbalanced loss contributions will cause the model representations to be optimized preferentially for the task with the largest individual loss, at the expense of the other tasks. To remedy this, you can assign different levels of importance to the loss values in their contribution to the final loss. This is useful in particular if the losses’ values use different scales. For instance, the mean squared error (MSE) loss used for the age-regression task typically takes a value around 3–5, whereas the cross-entropy loss used for the gender-classification task can be as low as 0.1. In such a situation, to balance the contribution of the different losses, you can assign a weight of 10 to the crossentropy loss and a weight of 0.25 to the MSE loss.

The above quote is taken from the Deep Learning with Python book. It is actually the only bit I've found online that gives a concrete example. I assume you need to figure out the value that each loss typically takes and then assign weights accordingly.
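
One rough way to do that, sketched below: run a short calibration fit with all loss weights left at their default of 1.0, read each output's typical loss from the training history (Keras logs one '<name>_loss' entry per named output), and set each weight roughly inversely proportional to that scale. The data variables here are placeholders for your own arrays:

    # Short calibration run with every loss weight left at 1.0;
    # x_train / y_train_dict are placeholders for your own data.
    history = model.fit(x_train, y_train_dict, epochs=2, batch_size=64)

    output_names = ['EMOTIONS', 'VALENCE', 'AROUSAL', 'DOMINANCE', 'AGE']

    # Typical scale of each loss at the end of the calibration run.
    typical = {name: history.history[name + '_loss'][-1]
               for name in output_names}

    # Reciprocal weighting puts every task on a comparable scale.
    loss_weights = {name: 1.0 / value for name, value in typical.items()}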
