[RFC][WIP] Tensor Expression level automatic differentiation
I’m working on automatic differentiation at the level of compute expressions, and I would like to share some progress and hear any comments. Currently the automatic differentiation works well enough for some operations that it is possible to train a simple model; here is a tutorial on how to do this. Yet for many operations the performance is still unacceptable, but I’m working on it.
My implementation mostly follows this paper. In this notebook I describe how it works internally and give a list of operations which are known to work or not to work. Basically, the AD consists of two parts:
- The automatic differentiation itself, which simply differentiates expressions according to the well-known rules and produces inefficient expressions (see the toy example after this list). The code is here.
- A set of transformations to optimize the resulting inefficient expressions. The code is here.
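To make the first part concrete, here is a toy sketch. It is written in the modern `tvm.te` spelling and only mimics the kind of expression naive differentiation produces; it is not the branch's actual output:

```python
import tvm
from tvm import te

# Toy example: B[i] = sum_k A[i, k], differentiated w.r.t. A.
m, n = 10, 10
A = te.placeholder((m, n), name="A")
k = te.reduce_axis((0, n), name="k")
B = te.compute((m,), lambda i: te.sum(A[i, k], axis=k), name="B")

# The textbook rule gives dB[i]/dA[p, q] = (i == p ? 1 : 0), so the
# naive adjoint of A sums a Kronecker delta over the whole range of
# i2, even though only the single point i2 == p contributes:
head = te.placeholder((m,), name="head")  # adjoint of B
i2 = te.reduce_axis((0, m), name="i2")
grad_A = te.compute(
    (m, n),
    lambda p, q: te.sum(head[i2] * te.if_then_else(i2 == p, 1.0, 0.0),
                        axis=i2),
    name="grad_A",
)
```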
All transformations work at the level of compute expressions (before scheduling). Their general goal is to eliminate summation over zeros by moving conditional expressions of the form `cond ? val : 0` upward and then using them to simplify the iteration domains of reductions. Hopefully these transformations will be useful for other tasks besides AD once they are powerful enough. Currently the main problem is that they don’t understand modular arithmetic, which is needed for differentiating dilated and strided convolutions and for the flattening operation; the sketch after this paragraph shows both the domain simplification and where modular arithmetic enters.
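Continuing the toy example above (again only a sketch of the idea, not the branch's output):

```python
import tvm
from tvm import te

m, n = 10, 10
head = te.placeholder((m,), name="head")

# Before: grad_A[p, q] = sum_{i2} (i2 == p ? head[i2] : 0).
# Lifting the condition out of the sum pins the reduction domain
# of i2 to the single point i2 == p, so the sum over zeros
# disappears entirely:
grad_A = te.compute((m, n), lambda p, q: head[p], name="grad_A")

# For a stride-s operation the lifted condition instead looks like
#   (q * s == p ? head[q] : 0),
# and eliminating the sum requires divisibility reasoning:
#   grad[p] = (p % s == 0) ? head[p // s] : 0
# which is exactly the modular arithmetic the transformations do
# not yet understand.
```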
- The git branch
- The squashed commit
- The tutorial on training a simple model
- The notebook describing some internals
Top GitHub Comments
Hello everyone. I want to tell you about the current status of tensor expression automatic differentiation. The latest version can be found here. The main improvements are as follows:
- There is now a class `Domain` which represents an iteration domain (a set of integer tuples, usually convex), and most of the functions transform domains into other domains, returning objects of the class `DomainTransformation` which represent two domains and the variable mappings between them (a rough sketch of these abstractions follows below).

However, there are several problems which are TVM-related and should be addressed before creating pull requests.
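As a rough illustration of the shape of the `Domain` / `DomainTransformation` abstractions described above (field names are my guesses based on that description, not the branch's actual interface):

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical sketch only: the real classes live in the branch
# and certainly differ in detail.

@dataclass
class Domain:
    """An iteration domain: a set of integer tuples, described by
    variables with ranges plus a list of constraining conditions."""
    variables: List[str]
    ranges: Dict[str, Tuple[int, int]]  # var -> (min, extent)
    conditions: List[str] = field(default_factory=list)

@dataclass
class DomainTransformation:
    """The result of transforming a domain: the old and new domains
    together with variable mappings in both directions."""
    old_domain: Domain
    new_domain: Domain
    old_to_new: Dict[str, str]  # new vars expressed via old ones
    new_to_old: Dict[str, str]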
I’ve updated our automatic differentiation branch. Now the result of differentiating flatten is acceptable, and operations like max pool work better as well. We have also improved the API; the simple use-case looks pretty much the same up to function renaming:
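For reference, the simple use-case looks roughly like this. The module path and exact signature of `differentiate` are assumptions based on the linked tutorial, so treat this as a sketch:

```python
import tvm
from tvm import te

# A tiny model: loss = sum((x @ w)^2)
x = te.placeholder((32, 10), name="x")
w = te.placeholder((10, 10), name="w")
k = te.reduce_axis((0, 10), name="k")
y = te.compute((32, 10),
               lambda i, j: te.sum(x[i, k] * w[k, j], axis=k), name="y")
i2 = te.reduce_axis((0, 32), name="i2")
j2 = te.reduce_axis((0, 10), name="j2")
loss = te.compute((1,),
                  lambda _: te.sum(y[i2, j2] * y[i2, j2], axis=[i2, j2]),
                  name="loss")

# Hypothetical spelling of the renamed entry point: differentiate
# `loss` with respect to `w` and get its adjoint back.
[dw] = tvm.differentiate(loss, [w])
```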
(The function `differentiate` is defined here, and here is a small tutorial.) However, it is now possible to get individual adjoints from the result, which may be useful for manually scheduling intermediate tensors. And it is also possible to override the Jacobian computation for some tensors, which may be useful when autodiff does a poor job. A hypothetical sketch of both follows.
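Continuing the sketch above; the attribute and parameter names here are guesses based on the prose, not the branch's documented interface:

```python
# Differentiate without unpacking to keep the whole result object.
res = tvm.differentiate(loss, [x, w])

# Individual adjoints, e.g. to schedule an intermediate manually
# (`adjoints` is a hypothetical attribute name):
dw = res.adjoints[w]

# Overriding the Jacobian computation for a particular tensor,
# e.g. supplying a hand-written rule for differentiating y
# (`override` and the callback signature are guesses):
def my_fdiff(out, inputs, head):
    # ...build and return the adjoints of `inputs`, given `head`,
    # the adjoint of `out`...
    raise NotImplementedError

res = tvm.differentiate(loss, [w], override={y: ([x, w], my_fdiff)})
```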