TL;DR
This paper optimizes dual-numbers reverse-mode automatic differentiation (AD) into an algorithm that has the correct complexity, handles higher-order input languages, and can be implemented efficiently in a standard functional language such as Haskell. The same design also enables task-parallel differentiation of functional programs.
Contribution
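It combines linear factoring, inspired by the analysis of Brunel, Mazza and Pagani, with well-known functional programming techniques to turn dual-numbers reverse AD into an efficient, correct, and parallelizable algorithm that runs in a standard functional language with mutable arrays.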
Findings
Attains the correct complexity by combining linear factoring with standard functional programming optimizations.
Supports differentiation of most Haskell98 programs.
Enables task-parallel reverse-mode AD for functional programs.
Abstract
Where dual-numbers forward-mode automatic differentiation (AD) pairs each scalar value with its tangent value, dual-numbers reverse-mode AD attempts to achieve reverse AD using a similarly simple idea: by pairing each scalar value with a backpropagator function. Its correctness and efficiency on higher-order input languages have been analysed by Brunel, Mazza and Pagani, but this analysis used a custom operational semantics for which it is unclear whether it can be implemented efficiently. We take inspiration from their use of linear factoring to optimise dual-numbers reverse-mode AD to an algorithm that has the correct complexity and enjoys an efficient implementation in a standard functional language with support for mutable arrays, such as Haskell. Aside from the linear factoring ingredient, our optimisation steps consist of well-known ideas from the functional programming community.…
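The pairing idea described in the abstract can be illustrated with a minimal Haskell sketch. It is a naive, unoptimized illustration and not the paper's algorithm: the names `Fwd`, `Rev`, `Contribs`, `f`, `gradFwd` and `gradRev` are hypothetical, and representing a gradient as a list of (input index, contribution) pairs is an assumption made only to keep the example short.

```haskell
-- Forward mode: each scalar is paired with its tangent.
data Fwd = Fwd Double Double

instance Num Fwd where
  Fwd x dx + Fwd y dy = Fwd (x + y) (dx + dy)
  Fwd x dx * Fwd y dy = Fwd (x * y) (x * dy + y * dx)
  negate (Fwd x dx)   = Fwd (negate x) (negate dx)
  abs (Fwd x dx)      = Fwd (abs x) (signum x * dx)
  signum (Fwd x _)    = Fwd (signum x) 0
  fromInteger n       = Fwd (fromInteger n) 0

-- Reverse mode: each scalar is paired with a backpropagator. Given the
-- cotangent of this scalar, the backpropagator returns the contributions
-- to the gradient as (input index, amount) pairs. (Hypothetical encoding.)
type Contribs = [(Int, Double)]

data Rev = Rev Double (Double -> Contribs)

instance Num Rev where
  Rev x bx + Rev y by = Rev (x + y) (\d -> bx d ++ by d)
  Rev x bx * Rev y by = Rev (x * y) (\d -> bx (y * d) ++ by (x * d))
  negate (Rev x bx)   = Rev (negate x) (bx . negate)
  abs (Rev x bx)      = Rev (abs x) (\d -> bx (signum x * d))
  signum (Rev x _)    = Rev (signum x) (const [])
  fromInteger n       = Rev (fromInteger n) (const [])

-- Example function, polymorphic so that both modes can run it.
f :: Num a => a -> a -> a
f x y = x * y + x * x

-- Forward mode needs one pass per input to obtain the full gradient.
gradFwd :: Double -> Double -> (Double, Double)
gradFwd x y = (dfdx, dfdy)
  where
    Fwd _ dfdx = f (Fwd x 1) (Fwd y 0)
    Fwd _ dfdy = f (Fwd x 0) (Fwd y 1)

-- Reverse mode runs f once, then calls the result's backpropagator
-- with cotangent 1 and sums the contributions per input.
gradRev :: Double -> Double -> (Double, Double)
gradRev x y = (total 0, total 1)
  where
    Rev _ bp = f (Rev x (\d -> [(0, d)])) (Rev y (\d -> [(1, d)]))
    contribs = bp 1
    total i  = sum [c | (j, c) <- contribs, j == i]
```

With these definitions, `gradFwd 3 2` and `gradRev 3 2` both evaluate to `(8.0,3.0)`. In `x * x` this naive version calls the backpropagator of `x` twice. Repeated calls of this kind on shared values are what linear factoring targets: for a linear function g, g(a) + g(b) = g(a + b), so the two calls can be merged into one.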
