
Keras losses discussion

See original GitHub issue

I’ve been working with Keras for a little while now and one of my main frustrations is the way it handles losses. Losses appear restricted to the form y_true, y_pred, where both tensors must have the same shape. For regular classification problems such as VOC, ImageNet, CIFAR, etc., this works fine, but for more ‘exotic’ networks it poses issues.

For example, the MNIST siamese example network works around this issue by outputting the distance between two inputs as the model output and then computing the loss based on that distance.
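Roughly, that pattern looks like the following (a condensed sketch of the idea rather than the exact example code; base_network stands for the shared embedding sub-model, and the input shape and margin value are illustrative):

import keras.backend as K
from keras.layers import Input, Lambda
from keras.models import Model

def euclidean_distance(vects):
    x, y = vects
    return K.sqrt(K.maximum(K.sum(K.square(x - y), axis=1, keepdims=True), K.epsilon()))

def contrastive_loss(y_true, y_pred):
    # y_pred is the distance produced by the model; y_true is 1 for similar pairs, 0 otherwise
    margin = 1.0
    return K.mean(y_true * K.square(y_pred) +
                  (1 - y_true) * K.square(K.maximum(margin - y_pred, 0)))

input_a = Input(shape=(784,))
input_b = Input(shape=(784,))
distance = Lambda(euclidean_distance)([base_network(input_a), base_network(input_b)])

model = Model([input_a, input_b], distance)
model.compile(loss=contrastive_loss, optimizer='rmsprop')

The loss still has the y_true, y_pred signature, but only because the distance was forced to be the model output.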

Similarly, the triplet loss depends not on two but on three images, meaning two distance measures. This is circumvented in this project by computing the loss inside the model, outputting that loss, and using an “identity_loss” to compute the mean of that output.
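The identity_loss trick boils down to something like this (a sketch; the model, data arrays, and sample count are placeholders, and the zeros array exists only to satisfy the y_true/y_pred signature):

import numpy as np
import keras.backend as K

def identity_loss(y_true, y_pred):
    # y_true is a dummy target; y_pred is already the per-sample triplet loss
    return K.mean(y_pred)

model.compile(loss=identity_loss, optimizer='adam')
model.fit([query_images, positive_images, negative_images],
          np.zeros((num_samples, 1)),  # dummy y_true, never actually used
          batch_size=32, epochs=10)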

The variational autoencoder demo resolves this in yet another way: it adds a custom layer that contributes to the model’s losses directly (via the layer’s add_loss method) and sets the model’s loss to None. This apparently works, but it is far from ideal and feels like a massive workaround.
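The mechanism looks roughly like this (a minimal sketch of the pattern, not the demo code itself; the KL term is omitted for brevity, the layer name is hypothetical, and input_img and x_decoded are assumed to be the encoder input and decoder output tensors):

import keras
import keras.backend as K
from keras.models import Model

class ReconstructionLossLayer(keras.layers.Layer):
    def call(self, inputs):
        x, x_decoded = inputs
        xent = keras.metrics.binary_crossentropy(K.flatten(x), K.flatten(x_decoded))
        self.add_loss(K.mean(xent), inputs=inputs)
        return x  # the return value never serves as a training target

y = ReconstructionLossLayer()([input_img, x_decoded])
vae = Model(input_img, y)
vae.compile(optimizer='rmsprop', loss=None)  # all loss comes from add_loss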

For an RPN (Region Proposal Network) it is even stranger. An RPN generates a variable number of proposals from an image and computes the targets (similar to y_true) for those proposals on the fly. In other words, y_true cannot be computed beforehand, nor can its shape be known in advance. The only solutions here are to use an identity_loss as in the triplet case, or to use a custom layer that adds a loss, as with the variational autoencoder.

There are multiple issues that discuss this problem.

This issue aims to be a discussion point on how to improve the current scenario.

My proposal would be to relax the restrictions on the loss parameter of a Keras model and allow arbitrary loss tensors. Below is some code showing how I imagine it (triplet network example):

from keras.layers import Conv2D, Dense, Flatten, Input, Lambda, MaxPool2D
from keras.models import Model, Sequential

def create_base_network(input_shape):
    seq = Sequential()
    seq.add(Conv2D(20, (5, 5), input_shape=input_shape))
    seq.add(MaxPool2D(pool_size=(2, 2)))
    seq.add(Flatten())
    seq.add(Dense(128))
    return seq

base_network = create_base_network((224, 224, 3))

query = Input((224, 224, 3))
positive = Input((224, 224, 3))
negative = Input((224, 224, 3))

query_embedding = base_network(query)
positive_embedding = base_network(positive)
negative_embedding = base_network(negative)

# triplet_loss is a user-defined function computing the triplet loss from the three embeddings
loss = Lambda(triplet_loss)([query_embedding, positive_embedding, negative_embedding])

# proposed API: 'loss' would accept arbitrary loss tensors, decoupled from 'outputs'
model = Model(inputs=[query, positive, negative], outputs=None, loss=[loss])

That network wouldn’t require an output, but it does have a loss function. The deployed version would output an embedding, but this isn’t necessary for the training model.

EDIT: Basically, this proposal would drop the coupling between outputs and loss, and with it the restrictions that coupling imposes. In addition, it would change the meaning of loss to a list of tensors that have to be minimized during training; outputs would simply be the list of tensors that the model returns at prediction time.
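To make the intended semantics concrete, training under this proposal might look something like the following (purely illustrative; none of this API exists today, and the data arrays are placeholders):

model.compile(optimizer='adam')      # no per-output loss functions needed
model.fit([query_images, positive_images, negative_images],
          batch_size=32, epochs=10)  # no y targets, since the loss tensors are self-contained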

@fchollet I would love to hear what you think of this proposal; is it something you would support? I can make an attempt to work on this but it is better to discuss such a significant change first. Will we have to worry about backwards compatibility, or should this be a change for the next major release of Keras?

Issue Analytics

  • State: closed
  • Created: 6 years ago
  • Reactions: 15
  • Comments: 8 (2 by maintainers)

Top GitHub Comments

4 reactions
0x00b1 commented, Jul 31, 2017

I want to echo @hgaiser. It’s extremely difficult to implement the aforementioned RPN losses in a flexible and intuitive manner. I disagree with @hgaiser on one point, however: losses on intermediate layers are not exotic (consider every model presented at CVPR or ICCV that uses RCNN-like anchor generation). ☺️

0 reactions
statskh commented, Feb 6, 2018

class CustomVariationalLayers(keras.layers.Layer):
    def vae_loss(self, inputs):
        x = inputs[0]
        z_decoded = inputs[1]
        x = K.flatten(x)
        z_decoded = K.flatten(z_decoded)
        xent_loss = keras.metrics.binary_crossentropy(x, z_decoded)
        kl_loss = -5e-4 * K.mean(
            1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
        return K.mean(xent_loss + kl_loss)

    def call(self, inputs):
        x = inputs[0]
        z_decoded = inputs[1]
        loss = self.vae_loss([x, z_decoded])
        self.add_loss(loss, inputs=inputs)
        return x

y = CustomVariationalLayers()([input_img, z_decoded])

When calling CustomVariationalLayers(), why aren’t z_mean and z_log_var needed as arguments?
