Understanding the Acceleration Phenomenon via High-Resolution Differential Equations
Bin Shi, Simon S. Du, Michael I. Jordan, Weijie J. Su

TL;DR
This paper introduces high-resolution differential equations to better understand and distinguish between accelerated gradient methods, revealing new insights into their convergence behaviors and leading to the development of novel optimization algorithms.
Contribution
The paper develops high-resolution ODEs that accurately differentiate between Nesterov's accelerated method and the heavy-ball method, and uncovers new convergence properties and algorithmic variants.
Findings
High-resolution ODEs distinguish between NAG-SC and heavy-ball methods.
NAG-C minimizes the squared gradient norm at an inverse cubic rate.
A family of new optimization methods maintains NAG-C's accelerated convergence rates for smooth convex functions.
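The difference between the heavy-ball method and NAG-SC can be illustrated with a minimal sketch. It assumes the single-sequence form of both iterations with step size s, strong-convexity parameter mu, and momentum coefficient (1 - sqrt(mu*s)) / (1 + sqrt(mu*s)). The function name `momentum_method`, the toy quadratic, and the zero-velocity start are illustrative choices, not taken from the paper.

```python
import numpy as np

def momentum_method(grad, x0, s, mu, n_iter, gradient_correction):
    """Heavy-ball (gradient_correction=False) or NAG-SC (True), single-sequence form."""
    q = np.sqrt(mu * s)
    alpha = (1 - q) / (1 + q)          # momentum coefficient shared by both methods
    x_prev = np.asarray(x0, dtype=float)
    x = x_prev.copy()                  # assumption: start with zero velocity
    for _ in range(n_iter):
        g = grad(x)
        step = x + alpha * (x - x_prev) - s * g
        if gradient_correction:
            # Extra term in NAG-SC only: difference of consecutive gradients
            step -= alpha * s * (g - grad(x_prev))
        x_prev, x = x, step
    return x

# Hypothetical test problem: f(x) = 0.5 * x^T A x with mu = 1 and L = 10
A = np.diag([1.0, 10.0])
mu, L = 1.0, 10.0
grad_f = lambda x: A @ x
x0 = np.array([1.0, 1.0])
s = 1.0 / L

x_hb = momentum_method(grad_f, x0, s, mu, 200, gradient_correction=False)
x_nag = momentum_method(grad_f, x0, s, mu, 200, gradient_correction=True)
print(np.linalg.norm(x_hb), np.linalg.norm(x_nag))
```

On this quadratic both runs converge, so the sketch shows only where the two updates differ, not the general convergence guarantees the paper establishes.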
Abstract
Gradient-based optimization algorithms can be studied from the perspective of limiting ordinary differential equations (ODEs). Motivated by the fact that existing ODEs do not distinguish between two fundamentally different algorithms, Nesterov's accelerated gradient method for strongly convex functions (NAG-SC) and Polyak's heavy-ball method, we study an alternative limiting process that yields high-resolution ODEs. We show that these ODEs permit a general Lyapunov function framework for the analysis of convergence in both continuous and discrete time. We also show that these ODEs are more accurate surrogates for the underlying algorithms; in particular, they not only distinguish between NAG-SC and Polyak's heavy-ball method, but they allow the identification of a term that we refer to as "gradient correction" that is present in NAG-SC but not in the heavy-ball method and is…
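To make the "gradient correction" term concrete, the equations can be sketched as follows. This is a recollection of the forms given in the paper, not something stated on this page; the symbols are assumptions here: s is the step size, \mu the strong-convexity parameter, and X = X(t) the continuous-time trajectory. Both methods share the same low-resolution limit, while the high-resolution ODEs differ by the Hessian-driven term \sqrt{s}\,\nabla^2 f(X)\dot{X}.

```latex
% Low-resolution ODE (shared by NAG-SC and heavy-ball)
\ddot{X} + 2\sqrt{\mu}\,\dot{X} + \nabla f(X) = 0

% High-resolution ODE, heavy-ball method
\ddot{X} + 2\sqrt{\mu}\,\dot{X} + (1 + \sqrt{\mu s})\,\nabla f(X) = 0

% High-resolution ODE, NAG-SC (gradient correction: \sqrt{s}\,\nabla^2 f(X)\dot{X})
\ddot{X} + 2\sqrt{\mu}\,\dot{X} + \sqrt{s}\,\nabla^2 f(X)\,\dot{X}
  + (1 + \sqrt{\mu s})\,\nabla f(X) = 0
```

Setting s to zero in either high-resolution equation recovers the shared low-resolution ODE, which is why the older limit cannot tell the two algorithms apart.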
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Advanced Optimization Algorithms Research · Stochastic Gradient Optimization Techniques
