The Physical Systems Behind Optimization Algorithms
Lin F. Yang, R. Arora, V. Braverman, Tuo Zhao

TL;DR
This paper employs differential equations to analyze the dynamics of various optimization algorithms in machine learning, providing physics-inspired insights applicable beyond convex settings.
Contribution
It introduces a unified physics-based framework to analyze multiple optimization algorithms, extending analysis to nonconvex and general conditions.
Findings
Unified physical perspective on optimization algorithms
Applicable to nonconvex and general problem settings
Insights into algorithm dynamics and convergence
Abstract
We use differential-equation-based approaches to provide some physics insights into analyzing the dynamics of popular optimization algorithms in machine learning. In particular, we study gradient descent, proximal gradient descent, coordinate gradient descent, proximal coordinate gradient, and Newton's methods, as well as their Nesterov-accelerated variants, in a unified framework motivated by a natural connection of optimization algorithms to physical systems. Our analysis is applicable to more general algorithms and optimization problems beyond convexity and strong convexity, e.g., under Polyak-Łojasiewicz and error bound conditions (possibly nonconvex).
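As a rough illustration of the link between an optimization algorithm and a differential equation, the sketch below treats plain gradient descent as a forward Euler discretization of the gradient flow dx/dt = -∇f(x). This is a standard textbook connection and not necessarily the exact construction used in the paper. The quadratic test function, the constant `A`, and all function names are assumptions chosen for illustration only.

```python
import math


def gradient_descent(grad, x0, step, n_steps):
    """Run plain gradient descent: x_{k+1} = x_k - step * grad(x_k)."""
    x = x0
    for _ in range(n_steps):
        x = x - step * grad(x)
    return x


# Hypothetical test problem: f(x) = 0.5 * A * x**2, so grad f(x) = A * x.
A = 2.0


def grad_f(x):
    return A * x


def gradient_flow_exact(x0, t):
    """Closed-form solution of dx/dt = -A * x, i.e. x(t) = x0 * exp(-A * t)."""
    return x0 * math.exp(-A * t)


def discretization_error(step, t_end=1.0, x0=1.0):
    """Gap at time t_end between gradient descent and the exact gradient flow."""
    n_steps = round(t_end / step)
    x_gd = gradient_descent(grad_f, x0, step, n_steps)
    return abs(x_gd - gradient_flow_exact(x0, t_end))


if __name__ == "__main__":
    # Smaller step sizes should track the continuous-time trajectory more closely.
    for h in (0.1, 0.01, 0.001):
        print(h, discretization_error(h))
```

Under these assumptions, shrinking the step size moves the iterates toward the continuous trajectory, which is the sense in which the algorithm can be studied through its limiting differential equation.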
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Advanced Optimization Algorithms Research
