A mathematical model for automatic differentiation in machine learning
Jerome Bolte (TSE), Edouard Pauwels (IRIT-ADRIA)

TL;DR
This paper develops a mathematical framework for automatic differentiation tailored to modern machine learning, addressing issues like nonsmooth functions and artificial critical points.
Contribution
It introduces a simple class of functions and a nonsmooth calculus, linking program differentiation with nonsmooth function differentiation, and analyzes the impact on stochastic methods.
Findings
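Relates differentiation of programs, as implemented in practice, to differentiation of nonsmooth functions
Shows that algorithmic differentiation can create artificial critical points, and that usual methods avoid them with probability one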
Provides a new mathematical model for automatic differentiation
Abstract
Automatic differentiation, as implemented today, does not have a simple mathematical model adapted to the needs of modern machine learning. In this work we articulate the relationships between differentiation of programs as implemented in practice and differentiation of nonsmooth functions. To this end we provide a simple class of functions, a nonsmooth calculus, and show how they apply to stochastic approximation methods. We also evidence the issue of artificial critical points created by algorithmic differentiation and show how usual methods avoid these points with probability one.
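To make the notion of an artificial critical point concrete, here is a minimal sketch in plain Python. It is not the authors' code. The `Dual` class, the `relu`, `f`, and `ad_derivative` helpers, and the convention relu'(0) = 0 are all assumptions introduced for illustration. The function f(x) = relu(x) - relu(-x) equals x for every real x, so its true derivative is 1 everywhere, yet forward-mode differentiation of the program reports 0 at x = 0.

```python
import random
from dataclasses import dataclass


@dataclass
class Dual:
    """Hypothetical dual number: a value paired with its derivative."""
    val: float
    der: float

    def __neg__(self):
        return Dual(-self.val, -self.der)

    def __sub__(self, other):
        return Dual(self.val - other.val, self.der - other.der)


def relu(x):
    # Assumed convention: the derivative of relu at 0 is taken to be 0.
    if x.val > 0:
        return Dual(x.val, x.der)
    return Dual(0.0, 0.0)


def f(x):
    # Mathematically f(x) = x for every real x, so f'(x) = 1 everywhere.
    return relu(x) - relu(-x)


def ad_derivative(fn, x):
    # Forward-mode differentiation: seed the input derivative with 1.
    return fn(Dual(x, 1.0)).der


print(ad_derivative(f, 1.0))   # 1.0, matches the true derivative
print(ad_derivative(f, -1.0))  # 1.0, matches the true derivative
print(ad_derivative(f, 0.0))   # 0.0, an artificial critical point

# Randomly drawn inputs land on the bad point x = 0 with probability zero,
# so the computed derivative agrees with the true one on every sample.
random.seed(0)
samples = [random.uniform(-1.0, 1.0) for _ in range(1000)]
print(all(ad_derivative(f, x) == 1.0 for x in samples))
```

The last check only illustrates, for this one toy function, why randomly initialized methods would not be expected to land on such a point. The probability-one statement in the abstract is a general result that this sketch does not reproduce.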
Taxonomy
Topics: Computability, Logic, AI Algorithms · Numerical Methods and Algorithms · Advanced Bandit Algorithms Research
