Optimization with Momentum: Dynamical, Control-Theoretic, and Symplectic Perspectives
Michael Muehlebach, Michael I. Jordan

TL;DR
This paper analyzes momentum-based optimization algorithms from a dynamical systems point of view, showing how their convergence rates depend on the algorithm parameters and why symplectic discretization matters for acceleration.
Contribution
The paper introduces a unified dynamical systems framework for analyzing momentum algorithms, derives explicit convergence rate formulas, and establishes the role of symplectic discretization schemes in acceleration.
Findings
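In many cases, closed-form expressions relate the algorithm parameters to the convergence rate
Symplectic discretization schemes are important for momentum-based algorithms, and algorithms that exhibit accelerated convergence are characterized
The analysis covers discrete-time and continuous-time, time-invariant and time-variant formulations, and is not limited to convex or Euclidean settings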
Abstract
We analyze the convergence rate of various momentum-based optimization algorithms from a dynamical systems point of view. Our analysis exploits fundamental topological properties, such as the continuous dependence of iterates on their initial conditions, to provide a simple characterization of convergence rates. In many cases, closed-form expressions are obtained that relate algorithm parameters to the convergence rate. The analysis encompasses discrete time and continuous time, as well as time-invariant and time-variant formulations, and is not limited to a convex or Euclidean setting. In addition, the article rigorously establishes why symplectic discretization schemes are important for momentum-based optimization algorithms, and provides a characterization of algorithms that exhibit accelerated convergence.
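To make the point about discretization concrete, the following is a minimal sketch and not the paper's own analysis or code. It assumes a damped momentum ODE of the form q' = p, p' = -2*damping*p - grad f(q), a diagonal quadratic objective, and hypothetical parameter values chosen only for illustration. It compares semi-implicit Euler (a standard symplectic scheme for the undamped part) with plain explicit Euler at the same step size.

```python
import math


def grad(q, eigs):
    # Gradient of the assumed objective f(q) = 0.5 * sum(eigs[i] * q[i]**2).
    return [lam * qi for lam, qi in zip(eigs, q)]


def run(scheme, eigs, damping, step, iters, q0):
    """Discretize q' = p, p' = -2*damping*p - grad f(q); return final ||q||.

    scheme = "semi_implicit": the position update uses the NEW momentum.
    scheme = "explicit":      the position update uses the OLD momentum.
    """
    q = list(q0)
    p = [0.0] * len(q)
    for _ in range(iters):
        g = grad(q, eigs)
        # Both schemes use the same explicit momentum update.
        p_new = [pi + step * (-2.0 * damping * pi - gi) for pi, gi in zip(p, g)]
        if scheme == "semi_implicit":
            q = [qi + step * pn for qi, pn in zip(q, p_new)]
        elif scheme == "explicit":
            q = [qi + step * pi for qi, pi in zip(q, p)]
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        p = p_new
    return math.sqrt(sum(qi * qi for qi in q))


if __name__ == "__main__":
    # Hypothetical test problem: curvatures mu = 1 and L = 100.
    eigs = [1.0, 100.0]
    damping = 1.0                   # assumed choice, sqrt(mu)
    step = 1.0 / math.sqrt(100.0)   # assumed choice, 1 / sqrt(L)
    for scheme in ("semi_implicit", "explicit"):
        dist = run(scheme, eigs, damping, step, 300, [1.0, 1.0])
        print(scheme, dist)
```

With these illustrative values, the semi-implicit iteration coincides with a heavy-ball update with momentum 1 - 2*damping*step = 0.8 and gradient step size step**2 = 0.01, and its iterates contract toward the minimizer. The explicit Euler iterates grow along the high-curvature direction at the same step size.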
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Matrix Theory and Algorithms · Sparse and Compressive Sensing Techniques
