Higher order derivatives return wrong results

See original GitHub issue

When trying to compute higher order derivatives of a very simple function:

from jax import grad

def inv(x):
    return 1 / x  # the original snippet used an undefined name, X

print(grad(grad(grad(grad(grad(grad(inv))))))(4.))

It prints -1.9560547 instead of the correct 0.0439453125. The result first becomes incorrect at x = 4; at x = 3.999 it is still correct.
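As a sanity check on the expected value, the n-th derivative of 1/x has the closed form (-1)^n * n! / x^(n+1). The snippet below is a small sketch that evaluates it exactly with the standard library; `reciprocal_derivative` is a hypothetical helper name, not part of JAX.

```python
from fractions import Fraction
from math import factorial

def reciprocal_derivative(x, n):
    """n-th derivative of 1/x at x, from (-1)**n * n! / x**(n + 1)."""
    return (-1) ** n * factorial(n) / Fraction(x) ** (n + 1)

# Sixth derivative at x = 4, computed exactly and then as a float.
expected = reciprocal_derivative(4, 6)
print(expected, float(expected))  # 45/1024 0.0439453125
```

This confirms that 0.0439453125 (that is, 720 / 4**7) is the value the nested grad call should return.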

Top GitHub Comments

shoyer commented, May 18, 2020

If we look at what’s going on, this appears to be a case where the higher order gradient calculation is unstable and also rather inefficient:

>>> make_jaxpr(grad(grad(grad(lambda x: 1/x))))(4.)
{ lambda  ; a.
  let b = mul a a
      c = mul b b
      d = div -1.0 c
      e = mul d 1.0
      f = neg e
      g = mul 1.0 f
      h = mul 1.0 f
      i = add_any g h
      j = mul 1.0 a
      k = mul 1.0 a
      l = add_any j k
      m = neg l
      n = mul m 1.0
      o = mul c c
      p = div n o
      q = mul p -1.0
      r = neg q
      s = mul r b
      t = mul r b
      u = add_any s t
      v = mul u a
      w = add_any i v
      x = mul u a
      y = add_any w x
  in (y,) }
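One plausible reading of the instability, offered as an assumption rather than something stated in the issue: the jaxpr above builds b = a*a, c = b*b, and o = c*c, so the largest denominator after three applications of grad is a**8. If the exponent keeps doubling with each grad, the sixth derivative would contain a denominator of x**64, which at x = 4 equals 2**128 and exceeds the largest finite float32 value (JAX computes in float32 by default). The sketch below only checks that arithmetic in plain Python; `largest_denominator` is a hypothetical helper.

```python
# Largest finite float32 value: (2 - 2**-23) * 2**127.
FLOAT32_MAX = (2 - 2**-23) * 2.0**127

def largest_denominator(x, order):
    # Assumption: each grad squares the denominator, so it grows as x**(2**order).
    return x ** (2 ** order)

# Does the assumed sixth-order denominator exceed the float32 range?
overflows = {x: largest_denominator(x, 6) > FLOAT32_MAX for x in (3.999, 4.0)}
for x, over in overflows.items():
    print(x, largest_denominator(x, 6), over)
```

Under that assumption the denominator stays finite at 3.999 but not at 4.0, which would be consistent with the reported threshold.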

The source of the issue is that grad(inv) turns into -1/(x*x) instead of -1/x**2:

>>> make_jaxpr(grad(lambda x: 1/x))(4.)
{ lambda  ; a.
  let b = mul a a
      c = div 1.0 b
      d = mul c 1.0
      e = neg d
  in (e,) }
>>> make_jaxpr(grad(lambda x: x**-1))(4.)
{ lambda  ; a.
  let b = pow a -2.0
      c = mul -1.0 b
      d = mul 1.0 c
  in (d,) }

The latter is much more efficient for higher order differentiation:

>>> make_jaxpr(grad(grad(grad(lambda x: x**-1))))(4.)
{ lambda  ; a.
  let b = pow a -4.0
      c = mul -3.0 b
      d = mul 2.0 c
  in (d,) }

It seems like we should consider defining the derivative rule for division in terms of pow rather than square (which should perhaps also be written in terms of pow). The pow primitive, in turn, would probably also need special-case logic for efficient integer powers.
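The idea can be tried in user code without touching JAX internals. The following is a minimal sketch, not the change proposed for the library itself: it wraps the reciprocal in `jax.custom_jvp` and writes its derivative as a power. The names `inv`, `_inv_jvp`, and `nth_grad` are illustrative, and how `x ** -2` is lowered may depend on the JAX version.

```python
import jax

@jax.custom_jvp
def inv(x):
    """Reciprocal with a hand-written derivative rule (illustrative only)."""
    return 1.0 / x

@inv.defjvp
def _inv_jvp(primals, tangents):
    (x,), (t,) = primals, tangents
    # d/dx (1/x) = -x**-2, expressed as a power rather than -1/(x*x).
    return inv(x), -t * x ** -2

def nth_grad(f, n):
    """Apply jax.grad to f n times."""
    for _ in range(n):
        f = jax.grad(f)
    return f

sixth = nth_grad(inv, 6)(4.0)
print(sixth)
```

The printed value can be compared against the closed form 720 / 4**7 to see whether the power-based rule avoids the problem on a given JAX version.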

shoyer commented, May 18, 2020

Yes, an integer_power primitive could work well. Possibly restricted to statically known powers?
