How do we know which backend functions are differentiable / not?
Hi, I'm trying to define custom loss functions for my model but have come across this error several times during development:
ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
I understand the need for differentiable loss functions, but I was wondering if there is any documentation on which functions in Keras are differentiable and which functions are not. My first instinct was that any of the Keras backend functions are differentiable and hence usable in my loss functions, but clearly (as seen in the error message) that is not the case.
I feel like it would be very helpful to have a list I can refer to, instead of discovering by trial and error as I have been doing so far. Does such a resource already exist, and if not, can we make one?
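For concreteness, this is the general shape of loss that keeps failing for me (a made-up reduction rather than my real loss, built around K.round since it's one of the ops named in the error):

```python
from keras import backend as K

# Hypothetical reduced example: K.round has no registered gradient,
# so nothing can flow back to y_pred, and compiling/fitting a model
# with this loss raises the ValueError quoted above.
def rounded_error_loss(y_true, y_pred):
    return K.mean(K.abs(y_true - K.round(y_pred)), axis=-1)
```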
Hi @KrishnanParameswaran, I couldn't figure out a perfect way to do it, but I was able to get what I needed done through trial and error, using a simple example like the following:
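Something along these lines, roughly (the exact snippet isn't reproduced here, so the model size, data shapes, and placeholder loss below are just my assumptions for illustration):

```python
import numpy as np
from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense

def candidate_loss(y_true, y_pred):
    # Swap whatever backend ops you want to test into this body
    return K.mean(K.abs(y_true - y_pred), axis=-1)

# Tiny throwaway model and random data, just enough to build gradients
model = Sequential([Dense(1, input_shape=(4,))])
model.compile(optimizer='sgd', loss=candidate_loss)

x = np.random.rand(32, 4)
y = np.random.rand(32, 1)

# If every op in candidate_loss has a gradient, this runs quietly;
# otherwise it fails with the `None` gradient ValueError.
model.fit(x, y, epochs=1, verbose=0)
```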
Using this sort of trial and error I did take note of which ops I found to be usable and which were problematic:
- Differentiable ops: K.mean and the common arithmetic operations (+, -, *, /), as long as y_pred is involved in the chain
- Non-differentiable ops: K.argmax, K.round, K.eval (the ones called out in the error message)
Also note that it's not just the op itself which determines differentiability; it's about how you chain the ops. For example, `K.mean(y_true)` has no gradient (the loss never touches y_pred) and will error, while `K.mean(y_true - y_pred)` does have a gradient and trains without issue.

But once you have a simple example to test on, it's actually not too difficult to get the loss you want. I managed to work out a number of custom losses by reformulating my objectives using common arithmetic operations, which can actually be quite descriptive when used creatively.
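To make that last point concrete, here are the two losses side by side (the names are mine; either can be dropped into the little harness above):

```python
from keras import backend as K

def mean_true_loss(y_true, y_pred):
    # y_pred never appears, so there is no path from the loss back to
    # the weights and Keras reports `None` for the gradient.
    return K.mean(y_true, axis=-1)

def mean_diff_loss(y_true, y_pred):
    # y_pred enters through subtraction, which is differentiable,
    # so gradients flow and training proceeds normally.
    return K.mean(y_true - y_pred, axis=-1)
```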
Hey, is K.cast from float64 to float32 differentiable?