
Gradient of tf.pow

See original GitHub issue

There are two issues that I’d like to point out with tf.pow’s gradient function:

First, there’s a bug with the gradient of tf.pow w.r.t. the exponent, whereby a non-positive base will result in a NaN for the corresponding exponent’s gradient. This can be reproduced using:

// tf.version_core === "0.8.1"
tf.grad(x => tf.pow(tf.scalar(0), x))(tf.scalar(2)).dataSync();  // => [NaN]
tf.grad(x => tf.pow(tf.scalar(-2), x))(tf.scalar(2)).dataSync(); // => [NaN]

The backend pow is implemented as pow(abs(a), b) * isEven(round(a)), where isEven is an indicator-style function mapping into {1, -1}. The gradient function, however, takes the log of the raw base, which may be negative, rather than the non-negative abs(a) that the backend actually exponentiates. One solution is to have an “unprotected” backend pow (which only ever receives non-negative bases) and move the abs and isEven components into the tf.pow op definition, where they would be cached for the backward pass. Although this takes care of the gradient when the base is negative, we still have an issue when the base is 0.
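
To make that concrete, here is a minimal sketch of the idea, written against the current tf.customGrad API rather than the internal op/backend split (so powAbsLog and its structure are illustrative, not the actual tfjs-core code): the exponent gradient takes log(abs(a)), matching the abs(a) the backend actually raises to the power.

// Sketch only: exponent gradient uses log(|a|) instead of log(a).
const powAbsLog = tf.customGrad((a, b, save) => {
  const y = tf.pow(a, b);               // forward pass; backend handles the sign
  save([a, b, y]);
  return {
    value: y,
    gradFunc: (dy, [a, b, y]) => [
      // d/da a^b = b * a^(b-1)
      tf.mul(dy, tf.mul(b, tf.pow(a, tf.sub(b, 1)))),
      // d/db a^b = a^b * log(|a|)   (log of the abs'ed base, not the raw base)
      tf.mul(dy, tf.mul(y, tf.log(tf.abs(a)))),
    ],
  };
});

// The negative-base repro from above is now finite:
tf.grad(x => powAbsLog(tf.scalar(-2), x))(tf.scalar(2)).dataSync(); // ~[2.77], not NaN

Note that the base-0 case still produces NaN in this sketch (0 * log(0)), which is the second issue below.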

When the base is 0, the gradients of both the base and the exponent will be NaN:

// tf.version_core === "0.8.1"
tf.grad(x => tf.pow(x, tf.scalar(2)))(tf.scalar(0)).dataSync(); // grad of base     => [NaN]
tf.grad(x => tf.pow(tf.scalar(0), x))(tf.scalar(2)).dataSync(); // grad of exponent => [NaN]

As just explained, when the base is 0 the gradient of the exponent is NaN because we’re taking log(0), and the gradient of the base is NaN because we’re dividing by 0. A simple solution would be to zero out the gradients of both the base and the exponent wherever the base is 0 (see the sketch after the list below). This makes sense for the gradients of a^x when a = 0 because

  • the derivative w.r.t. the exponent is a^x * log(a), which at a = 0 becomes 0^x * log(0) = 0 * log(0); taking this as the limit a → 0⁺ (with x > 0), it is 0
  • the derivative w.r.t. the base is x * a^(x-1), which at a = 0 (for x > 1) is x * 0^(x-1) = 0
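
A minimal sketch of that zero-out, for both gradients (zeroWhereBaseIsZero is a hypothetical helper, not library API; the real fix would live inside the pow gradient function):

// Wherever the saved base is 0, replace the computed gradient with 0.
const zeroWhereBaseIsZero = (base, grad) =>
  tf.where(tf.equal(base, 0), tf.zerosLike(grad), grad);

// Applied to both returned gradients before they leave the grad function:
//   gradBase = zeroWhereBaseIsZero(a, gradBase);
//   gradExp  = zeroWhereBaseIsZero(a, gradExp);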

Fortunately, I believe that using the exponent’s gradient is a rare use case, because there’s rarely a path from a variable to an exponent (am I wrong to assume this?), so fixing the gradient of the exponent wouldn’t be such a high priority. However, the gradient of the base should be fixed because that case is commonly encountered, e.g. in regularization: if someone uses tf.pow (rather than tf.square) and a parameter reaches 0, a NaN will be produced, and it will eventually propagate to the variables and mess everything up.
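
For concreteness, this is roughly how that failure mode shows up, using the current tfjs API (tf.variable, tf.variableGrads) purely for illustration; the NaN is the pre-fix behaviour described above:

// An L2-style penalty written with tf.pow; one weight has reached 0.
const w = tf.variable(tf.tensor1d([0.5, 0.0]));
const loss = () => tf.sum(tf.pow(w, tf.scalar(2)));
const {grads} = tf.variableGrads(loss);
grads[w.name].print(); // -> [1, NaN] with the buggy base gradient; [1, 0] once zeroed out
// The next optimizer step writes that NaN into w, and it spreads from there.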

Would y’all be open to these changes?

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Reactions: 1
  • Comments: 8 (7 by maintainers)

Top GitHub Comments

1 reaction
jgartman commented, May 27, 2018

Opened a separate issue for the forward func #350.

0 reactions
nsthorat commented, Mar 13, 2019

This is fixed!

