On the Correctness of Automatic Differentiation for Neural Networks with Machine-Representable Parameters
Wonyeol Lee, Sejun Park, Alex Aiken

TL;DR
This paper studies when automatic differentiation (AD) is correct for neural networks whose parameters are restricted to machine-representable numbers. It gives conditions under which AD computes the true derivative and bounds the set of parameters on which the network is non-differentiable.
Contribution
It offers the first rigorous analysis of AD correctness on machine-representable parameters, including bounds on non-differentiable sets and conditions for correctness.
Findings
The incorrect set is always empty for neural networks with bias parameters.
The size of the non-differentiable set has a tight bound that is linear in the number of non-differentiabilities in the activations.
AD computes a Clarke subderivative even at non-differentiable points.
Abstract
Recent work has shown that forward- and reverse-mode automatic differentiation (AD) over the reals is almost always correct in a mathematically precise sense. However, actual programs work with machine-representable numbers (e.g., floating-point numbers), not reals. In this paper, we study the correctness of AD when the parameter space of a neural network consists solely of machine-representable numbers. In particular, we analyze two sets of parameters on which AD can be incorrect: the incorrect set on which the network is differentiable but AD does not compute its derivative, and the non-differentiable set on which the network is non-differentiable. For a neural network with bias parameters, we first prove that the incorrect set is always empty. We then prove a tight bound on the size of the non-differentiable set, which is linear in the number of non-differentiabilities in activation…
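The two sets named in the abstract can be made concrete with a toy example. The sketch below is not taken from the paper: it is a minimal forward-mode AD over dual numbers, and the names `Dual`, `relu`, and `no_bias_net` are hypothetical. It assumes the common convention that the derivative of ReLU at 0 is taken to be 0. The bias-free function relu(w) - relu(-w) equals w for every w, so its true derivative is 1 everywhere, yet AD under this convention returns 0 at w = 0. That point is differentiable but AD gets it wrong, so it belongs to the incorrect set. A single ReLU at 0, by contrast, is a non-differentiable point, and the value AD returns there (0) lies in the Clarke subdifferential [0, 1].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one scalar parameter."""
    val: float
    der: float

    def __neg__(self):
        return Dual(-self.val, -self.der)

    def __sub__(self, other):
        return Dual(self.val - other.val, self.der - other.der)


def relu(d: Dual) -> Dual:
    # Assumed convention (not from the paper): the slope used at 0 is 0.
    slope = 1.0 if d.val > 0 else 0.0
    return Dual(max(d.val, 0.0), slope * d.der)


def no_bias_net(w: float) -> Dual:
    # Hypothetical bias-free "network": relu(w) - relu(-w), which equals w.
    p = Dual(w, 1.0)  # seed derivative dw/dw = 1
    return relu(p) - relu(-p)


if __name__ == "__main__":
    for w in (-2.0, 0.0, 1.0):
        out = no_bias_net(w)
        print(f"w={w}: value={out.val}, AD derivative={out.der}, true derivative=1.0")
```

Away from w = 0 the AD derivative matches the true derivative of 1. Only at w = 0 do both ReLU branches hit the assumed slope of 0, and the two errors do not cancel.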
Taxonomy
Topics: Machine Learning and Algorithms · Receptor Mechanisms and Signaling · Model Reduction and Neural Networks
