TL;DR
This survey explores the integration of automatic differentiation techniques into machine learning, highlighting its applications, implementation methods, and the evolving terminology to clarify its role in the field.
Contribution
It provides a comprehensive overview of AD's application in machine learning, clarifies differentiation techniques, and discusses recent adoption trends like differentiable programming.
Findings
AD is increasingly adopted in machine learning workflows.
Different differentiation techniques are precisely defined and related.
AD's role is expanding with dynamic computational graphs and differentiable programming.
Abstract
Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in machine learning. Automatic differentiation (AD), also called algorithmic differentiation or simply "autodiff", is a family of techniques similar to but more general than backpropagation for efficiently and accurately evaluating derivatives of numeric functions expressed as computer programs. AD is a small but established field with applications in areas including computational fluid dynamics, atmospheric sciences, and engineering design optimization. Until very recently, the fields of machine learning and AD have largely been unaware of each other and, in some cases, have independently discovered each other's results. Despite its relevance, general-purpose AD has been missing from the machine learning toolbox, a situation slowly changing with its ongoing adoption under the names "dynamic computational graphs"…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
