Differentiating a Tensor Language
Gilbert Bernstein, Michael Mara, Tzu-Mao Li, Dougal Maclaurin, and Jonathan Ragan-Kelley

TL;DR
This paper introduces a novel, purely functional approach to differentiating tensor programs that guarantees efficiency and avoids asymptotic slowdowns, addressing limitations of existing AD methods.
Contribution
It presents the first provably efficient, purely functional reverse-mode differentiation method for tensor code that explicitly accounts for sparsity and introduces new formalisms like Tensor SSA.
Findings
Naive differentiation can cause asymptotic slowdowns in tensor programs in pathological cases
Existing AD methods that guarantee efficiency rely on += mutation, which complicates downstream optimization
The proposed method guarantees efficiency and handles sparsity explicitly
Abstract
How does one compile derivatives of tensor programs, such that the resulting code is purely functional (hence easier to optimize and parallelize) and provably efficient relative to the original program? We show that naively differentiating tensor code, as done in popular systems like TensorFlow and PyTorch, can cause asymptotic slowdowns in pathological cases, violating the Cheap Gradients Principle. However, all existing automatic differentiation methods that guarantee this principle (for variable size data) do so by relying on += mutation through aliases/pointers, which complicates downstream optimization. We provide the first purely functional, provably efficient, adjoint/reverse-mode derivatives of array/tensor code by explicitly accounting for sparsity. We do this by focusing on the indicator function from Iverson's APL. We also introduce a new "Tensor SSA" normal form and a new…
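To make the slowdown concrete, here is a minimal sketch in plain Python. It is an illustration only, not the paper's construction. It assumes the simplest possible primal program, reading one element `x[i]`, and all function names (`index_primal`, `index_adjoint_dense`, `index_adjoint_sparse`, `densify`) are hypothetical. The adjoint of that read is `dx[j] = [i == j] * dy`, where `[i == j]` is the Iverson-style indicator. Materializing it densely costs O(n) for an O(1) primal, while keeping only the nonzero entry stays O(1) and needs no in-place `+=` on a shared buffer.

```python
def index_primal(x, i):
    # Primal program: read a single element, O(1) work.
    return x[i]

def index_adjoint_dense(n, i, dy):
    # Naive adjoint: evaluate dx[j] = [i == j] * dy for every j.
    # This is O(n) work for an O(1) primal.
    return [dy if j == i else 0.0 for j in range(n)]

def index_adjoint_sparse(i, dy):
    # Sparsity-aware adjoint: store only the entry where the indicator
    # [i == j] is nonzero. O(1) work, returned as a fresh value.
    return {i: dy}

def densify(sparse, n):
    # Helper used only to compare the two representations.
    out = [0.0] * n
    for j, v in sparse.items():
        out[j] += v
    return out

x = [1.0, 2.0, 3.0, 4.0]
y = index_primal(x, 2)
dense = index_adjoint_dense(len(x), 2, 1.0)
sparse = index_adjoint_sparse(2, 1.0)

# Both representations describe the same gradient.
assert densify(sparse, len(x)) == dense
```

If such a read sits inside a loop of n iterations, the dense version does O(n^2) total work against an O(n) primal, which is the kind of asymptotic gap the abstract refers to.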
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Computational Physics and Python Applications · Tensor decomposition and applications
