Eigh gradients can be wrong


In many cases, eigh doesn’t return the correct gradients. I have attached a very simple example that compares the JAX gradients (incorrect) with the autograd gradients (correct).

JAX gradients computation:

from jax import grad
import jax.numpy as jnp

a = jnp.array([[1.93865817, 0.35509264, 0.64405863, 1.3430815 , 0.92342772],
               [0.35509264, 1.30340895, 0.90821576, 0.7199631 , 1.49002679],
               [0.64405863, 0.90821576, 0.50616539, 0.97171981, 0.9621489 ],
               [1.3430815 , 0.7199631 , 0.97171981, 1.27540148, 1.44715998],
               [0.92342772, 1.49002679, 0.9621489 , 1.44715998, 1.75984414]])


def fun(x):
    e,v = jnp.linalg.eigh(x)
    return jnp.sum(jnp.abs(v))

diff_fun = (grad(fun))
print(diff_fun(a))

JAX Output: (incorrect)

[[-0.33112606  0.32743236 -0.18718082  0.643335   -0.30847853]
 [ 0.32743236  0.5745873  -0.07279176 -0.0815831  -0.50257957]
 [-0.18718082 -0.07279176 -0.4819119   0.29376295  0.19632888]
 [ 0.643335   -0.0815831   0.29376295 -0.4126568  -0.28805062]
 [-0.30847853 -0.50257957  0.19632888 -0.28805062  0.65110785]]

Autograd gradients computation script:

import autograd.numpy as npa
from autograd import elementwise_grad as grad  

a = npa.array([[1.93865817, 0.35509264, 0.64405863, 1.3430815 , 0.92342772],
               [0.35509264, 1.30340895, 0.90821576, 0.7199631 , 1.49002679],
               [0.64405863, 0.90821576, 0.50616539, 0.97171981, 0.9621489 ],
               [1.3430815 , 0.7199631 , 0.97171981, 1.27540148, 1.44715998],
               [0.92342772, 1.49002679, 0.9621489 , 1.44715998, 1.75984414]])
def fun(x):
    e,v = npa.linalg.eigh(x)
    return npa.sum(npa.abs(v))

diff_fun = (grad(fun))
print(diff_fun(a))

Autograd Output: (correct)

[[-0.33112716  0.52218473  0.7270792   0.67123525 -1.13012818]
 [ 0.13268013  0.57458894  1.47796036  0.09381337 -1.45197577]
 [-1.10143499 -1.62353805 -0.48191193  0.63382524  1.86241553]
 [ 0.61543807 -0.25697898 -0.04630763 -0.41265955  0.06441824]
 [ 0.51316611  0.44681027 -1.46975969 -0.64051274  0.65110971]]

It’s worth noting that the mistake may be related exclusively to the eigenvectors. When the function uses only the eigenvalues, JAX returns the correct gradients, for example:

def fun(x):
    e,v = jnp.linalg.eigh(x)
    return jnp.sum(jnp.abs(e))
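
One possible reason the eigenvalue-only case agrees is that its gradient is already a symmetric matrix. The sketch below is an illustration, not part of the original report: it uses NumPy in float64 instead of JAX, and the closed-form gradient `V diag(sign(e)) V^T` comes from standard first-order perturbation theory, assuming all eigenvalues are simple and nonzero. The names `fun_e`, `grad_e`, `s`, and `eps`, as well as the random seed, are made up for this example.

```python
import numpy as np

a = np.array([[1.93865817, 0.35509264, 0.64405863, 1.3430815 , 0.92342772],
              [0.35509264, 1.30340895, 0.90821576, 0.7199631 , 1.49002679],
              [0.64405863, 0.90821576, 0.50616539, 0.97171981, 0.9621489 ],
              [1.3430815 , 0.7199631 , 0.97171981, 1.27540148, 1.44715998],
              [0.92342772, 1.49002679, 0.9621489 , 1.44715998, 1.75984414]])

def fun_e(x):
    # Same objective as above, evaluated with NumPy instead of JAX.
    e, _ = np.linalg.eigh(x)
    return np.sum(np.abs(e))

# Assumption: simple, nonzero eigenvalues, so d(e_i)/dA = v_i v_i^T and
# d(sum |e_i|)/dA = V diag(sign(e)) V^T.
e, v = np.linalg.eigh(a)
grad_e = (v * np.sign(e)) @ v.T

# Central finite difference along a random symmetric direction s.
rng = np.random.default_rng(0)
r = rng.standard_normal(a.shape)
s = 0.5 * (r + r.T)
eps = 1e-6
fd = (fun_e(a + eps * s) - fun_e(a - eps * s)) / (2 * eps)

print(np.allclose(grad_e, grad_e.T))  # the gradient has no antisymmetric part
print(fd, np.sum(grad_e * s))         # the two numbers should agree closely
```

Because this gradient equals its own symmetric part, symmetrizing it changes nothing, which would be consistent with JAX and autograd agreeing in the eigenvalue-only case.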

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

5 reactions
buwantaiji commented, Jul 1, 2022

I notice that the gradient output of Jax is just the symmetric part of that of autograd.

import jax.numpy as jnp

# Gradient printed by JAX in the original report.
grad_jax = jnp.array([[-0.33112606, 0.32743236, -0.18718082, 0.643335, -0.30847853],
                      [ 0.32743236, 0.5745873, -0.07279176, -0.0815831, -0.50257957],
                      [-0.18718082, -0.07279176, -0.4819119, 0.29376295, 0.19632888],
                      [ 0.643335, -0.0815831, 0.29376295, -0.4126568, -0.28805062],
                      [-0.30847853, -0.50257957, 0.19632888, -0.28805062, 0.65110785]])

# Gradient printed by autograd in the original report.
grad_autograd = jnp.array([[-0.33112716, 0.52218473, 0.7270792, 0.67123525, -1.13012818],
                           [ 0.13268013, 0.57458894, 1.47796036, 0.09381337, -1.45197577],
                           [-1.10143499, -1.62353805, -0.48191193, 0.63382524, 1.86241553],
                           [ 0.61543807, -0.25697898, -0.04630763, -0.41265955, 0.06441824],
                           [ 0.51316611, 0.44681027, -1.46975969, -0.64051274, 0.65110971]])

# Difference between the JAX gradient and the symmetric part of the autograd gradient.
print(grad_jax - 0.5 * (grad_autograd + grad_autograd.T))

The output is

[[ 1.1026859e-06 -5.9604645e-08 -2.9504299e-06 -1.6689301e-06 2.4735928e-06]
 [-5.9604645e-08 -1.6689301e-06 -2.9280782e-06 -2.8312206e-07 3.2186508e-06]
 [-2.9504299e-06 -2.9280782e-06  2.9802322e-08  4.1425228e-06 9.5367432e-07]
 [-1.6689301e-06 -2.8312206e-07  4.1425228e-06  2.7418137e-06 -3.3676624e-06]
 [ 2.4735928e-06  3.2186508e-06  9.5367432e-07 -3.3676624e-06 -1.8477440e-06]]

Since the original matrix a to be diagonalized is symmetric, this discrepancy between the JAX and autograd results is not really a bug. In practical applications, the symmetry of a is usually guaranteed by some upstream computation, and whether the adjoint of a itself is symmetric or not does not affect the gradients of real interest.
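
The argument above can be checked numerically with a finite difference along a symmetric direction. The following is a hedged sketch rather than code from the thread: it uses NumPy in float64 in place of JAX or autograd, copies the two gradients verbatim from the outputs reported above, and picks an arbitrary random seed and step size `eps`. The names `s`, `fd`, `d_jax`, and `d_autograd` are introduced only for this illustration.

```python
import numpy as np

a = np.array([[1.93865817, 0.35509264, 0.64405863, 1.3430815 , 0.92342772],
              [0.35509264, 1.30340895, 0.90821576, 0.7199631 , 1.49002679],
              [0.64405863, 0.90821576, 0.50616539, 0.97171981, 0.9621489 ],
              [1.3430815 , 0.7199631 , 0.97171981, 1.27540148, 1.44715998],
              [0.92342772, 1.49002679, 0.9621489 , 1.44715998, 1.75984414]])

grad_jax = np.array([[-0.33112606, 0.32743236, -0.18718082, 0.643335, -0.30847853],
                     [ 0.32743236, 0.5745873, -0.07279176, -0.0815831, -0.50257957],
                     [-0.18718082, -0.07279176, -0.4819119, 0.29376295, 0.19632888],
                     [ 0.643335, -0.0815831, 0.29376295, -0.4126568, -0.28805062],
                     [-0.30847853, -0.50257957, 0.19632888, -0.28805062, 0.65110785]])

grad_autograd = np.array([[-0.33112716, 0.52218473, 0.7270792, 0.67123525, -1.13012818],
                          [ 0.13268013, 0.57458894, 1.47796036, 0.09381337, -1.45197577],
                          [-1.10143499, -1.62353805, -0.48191193, 0.63382524, 1.86241553],
                          [ 0.61543807, -0.25697898, -0.04630763, -0.41265955, 0.06441824],
                          [ 0.51316611, 0.44681027, -1.46975969, -0.64051274, 0.65110971]])

def fun(x):
    # Same objective as in the report: sum of absolute eigenvector entries.
    _, v = np.linalg.eigh(x)
    return np.sum(np.abs(v))

# A symmetric perturbation keeps a + eps * s symmetric, as upstream code would.
rng = np.random.default_rng(0)
r = rng.standard_normal(a.shape)
s = 0.5 * (r + r.T)

# Central finite difference of fun along s.
eps = 1e-6
fd = (fun(a + eps * s) - fun(a - eps * s)) / (2 * eps)

# Directional derivatives predicted by each reported gradient.
d_jax = np.sum(grad_jax * s)
d_autograd = np.sum(grad_autograd * s)
print(fd, d_jax, d_autograd)  # the three values should agree closely
```

The antisymmetric part of a gradient has zero inner product with any symmetric matrix, so both gradients predict the same change as long as the perturbation of a stays symmetric.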

1 reaction
randolf-scholz commented, May 31, 2022

Notable observation: the V-matrix in the eigenvalue decomposition is not unique because $VΛV^⊤ = (V D) Λ (VD)^⊤$ for any matrix $D=diag(±1)$.

Running your code, we can see that the eigenvector matrix $V$ returned by JAX is

array([[ 0.07,  0.11, -0.49, -0.74,  0.44],
       [-0.44, -0.14, -0.57,  0.55,  0.4 ],
       [ 0.59, -0.72,  0.05,  0.11,  0.34],
       [-0.55, -0.21,  0.62, -0.2 ,  0.48],
       [ 0.38,  0.63,  0.23,  0.31,  0.55]], dtype=float32)

whereas for autograd it is

array([[-0.07,  0.11,  0.49, -0.74, -0.44],
       [ 0.44, -0.14,  0.57,  0.55, -0.4 ],
       [-0.59, -0.72, -0.05,  0.11, -0.34],
       [ 0.55, -0.21, -0.62, -0.2 , -0.48],
       [-0.38,  0.63, -0.23,  0.31, -0.55]])

I.e. $D = diag([-1, +1, -1, +1, -1])$. Of course, $‖V‖₁,₁=∑ᵢⱼ|Vᵢⱼ|$ is unchanged by these sign flips, so the gradient should be identical either way.
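
The sign ambiguity can be illustrated with a short NumPy sketch. It is an illustration only, not the code that produced the outputs above: the signs NumPy picks for its eigenvectors may differ from both JAX and autograd, and the names `d` and `v_flipped` are made up here. The sign pattern is the one observed above.

```python
import numpy as np

a = np.array([[1.93865817, 0.35509264, 0.64405863, 1.3430815 , 0.92342772],
              [0.35509264, 1.30340895, 0.90821576, 0.7199631 , 1.49002679],
              [0.64405863, 0.90821576, 0.50616539, 0.97171981, 0.9621489 ],
              [1.3430815 , 0.7199631 , 0.97171981, 1.27540148, 1.44715998],
              [0.92342772, 1.49002679, 0.9621489 , 1.44715998, 1.75984414]])

e, v = np.linalg.eigh(a)

# Flip the signs of columns 1, 3 and 5, i.e. D = diag([-1, +1, -1, +1, -1]).
d = np.diag([-1.0, 1.0, -1.0, 1.0, -1.0])
v_flipped = v @ d

# Both v and v_flipped diagonalize a ...
print(np.allclose(v @ np.diag(e) @ v.T, a))
print(np.allclose(v_flipped @ np.diag(e) @ v_flipped.T, a))
# ... and the objective sum(|V|) cannot tell them apart.
print(np.isclose(np.sum(np.abs(v)), np.sum(np.abs(v_flipped))))
```

All three checks print True, so the different column signs alone cannot explain the gradient discrepancy.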
