Automatic Differentiation in PCF
Damiano Mazza, Michele Pagani

TL;DR
This paper proves that automatic differentiation in a higher-order, Turing-complete language is almost everywhere correct, with failures only on a measure-zero set, under mild conditions on primitive functions.
Contribution
It establishes the almost-everywhere correctness of AD in PCF with real numbers and gives an explicit description of the set of failure points, for example when the primitive functions are constants, addition and multiplication.
Findings
AD is almost everywhere correct in PCF with real numbers.
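Failure points form a set of Lebesgue measure zero; when the primitives are just constants, addition and multiplication, this set is contained in a countable union of zero sets of non-identically-zero polynomials.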
Results apply to both forward and reverse mode AD.
Abstract
We study the correctness of automatic differentiation (AD) in the context of a higher-order, Turing-complete language (PCF with real numbers), both in forward and reverse mode. Our main result is that, under mild hypotheses on the primitive functions included in the language, AD is almost everywhere correct, that is, it computes the derivative or gradient of the program under consideration except for a set of Lebesgue measure zero. Stated otherwise, there are inputs on which AD is incorrect, but the probability of randomly choosing one such input is zero. Our result is in fact more precise, in that the set of failure points admits a more explicit description: for example, in case the primitive functions are just constants, addition and multiplication, the set of points where AD fails is contained in a countable union of zero sets of non-identically-zero polynomials.
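To make the notion of a failure point concrete, here is a minimal sketch of forward-mode AD using dual numbers in plain Python. It is not the paper's PCF formalization. The names `Dual`, `derivative`, `silly_identity` and `square_with_branch` are hypothetical, chosen only for this illustration. Both example programs compute a differentiable function, but each takes a constant branch on the zero set of a polynomial, and AD returns the wrong derivative exactly there.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A dual number val + eps * tan, the usual carrier for forward-mode AD."""
    val: float
    tan: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.tan + other.tan)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule on the tangent component.
        return Dual(self.val * other.val,
                    self.tan * other.val + self.val * other.tan)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (tangent 0)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def derivative(f, x):
    """Forward-mode derivative of f at x: seed the tangent with 1."""
    return f(Dual(float(x), 1.0)).tan


def silly_identity(x):
    # Computes the identity function on every input, so the true derivative
    # is 1 everywhere. The conditional inspects only the primal value.
    if x.val == 0.0:
        return Dual(0.0)  # constant branch: AD reports derivative 0 here
    return x


def square_with_branch(x):
    # Computes x * x on every input (true derivative 2x), but returns the
    # constant 1 on the zero set of the polynomial x^2 - 1, i.e. {-1, 1}.
    test = x * x + (-1.0)
    if test.val == 0.0:
        return Dual(1.0)
    return x * x


print(derivative(silly_identity, 2.0))      # correct away from 0
print(derivative(silly_identity, 0.0))      # failure point: true value is 1
print(derivative(square_with_branch, 3.0))  # correct away from {-1, 1}
print(derivative(square_with_branch, 1.0))  # failure point: true value is 2
```

In this sketch the failure points are {0} and {-1, 1}, each the zero set of a non-identically-zero polynomial and hence of Lebesgue measure zero, which matches the shape of the description given in the abstract for the case of constants, addition and multiplication.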
