Multiscale analysis of accelerated gradient methods
Mohammad Farazmand

TL;DR
This paper uses geometric singular perturbation theory to show that the continuous-time limit of accelerated gradient methods has an attracting slow manifold, derives explicit approximations of that manifold, and illustrates the results on three examples.
Contribution
The paper applies geometric singular perturbation theory to characterize the slow manifold of the accelerated gradient flow and gives explicit series approximations of it to arbitrary order.
Findings
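Under certain conditions, the accelerated gradient flow has an attracting invariant slow manifold to which its trajectories converge asymptotically.
An explicit functional series expansion approximates the slow manifold to any order of accuracy.
To leading order, the flow reduced to the slow manifold coincides with the usual gradient descent.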
Abstract
Accelerated gradient descent iterations are widely used in optimization. It is known that, in the continuous-time limit, these iterations converge to a second-order differential equation which we refer to as the accelerated gradient flow. Using geometric singular perturbation theory, we show that, under certain conditions, the accelerated gradient flow possesses an attracting invariant slow manifold to which the trajectories of the flow converge asymptotically. We obtain a general explicit expression in the form of functional series expansions that approximates the slow manifold to any arbitrary order of accuracy. To the leading order, the accelerated gradient flow reduced to this slow manifold coincides with the usual gradient descent. We illustrate the implications of our results on three examples.
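The slow-manifold picture in the abstract can be illustrated numerically. The sketch below is not the paper's equation or method: it assumes a generic singularly perturbed damped system, x' = v, eps v' = -v - grad f(x), with a hypothetical quadratic objective f(x) = 0.5 xᵀAx, and integrates it with explicit Euler alongside the plain gradient flow x' = -grad f(x). Starting off the manifold (v = 0), the velocity should relax quickly toward the leading-order relation v ≈ -grad f(x), after which the position should track the gradient flow up to an error that shrinks with eps. All names and parameter values (`simulate`, `eps`, `dt`, `A`) are illustrative assumptions.

```python
import numpy as np


def grad_f(x, A):
    """Gradient of the assumed quadratic objective f(x) = 0.5 * x^T A x."""
    return A @ x


def simulate(eps=0.01, dt=1e-3, t_final=2.0):
    """Integrate an assumed fast-slow system and the plain gradient flow.

    Fast-slow system (an illustrative assumption, not the paper's equation):
        x' = v,   eps * v' = -v - grad_f(x)
    Gradient flow used for comparison:
        x' = -grad_f(x)

    Returns (residual, gap):
        residual = ||v + grad_f(x)|| / ||grad_f(x)|| at the final time,
                   i.e. the relative distance from the leading-order manifold;
        gap      = ||x - x_gd|| between the two final positions.
    """
    A = np.diag([1.0, 4.0])      # hypothetical Hessian of the objective
    x = np.array([1.0, 1.0])     # initial position
    v = np.zeros(2)              # initial velocity, deliberately off the manifold
    x_gd = x.copy()              # gradient-flow trajectory from the same start

    n_steps = int(round(t_final / dt))
    for _ in range(n_steps):
        g = grad_f(x, A)
        # Explicit Euler step; dt must stay well below eps for stability.
        x, v = x + dt * v, v + (dt / eps) * (-v - g)
        x_gd = x_gd - dt * grad_f(x_gd, A)

    g = grad_f(x, A)
    residual = np.linalg.norm(v + g) / np.linalg.norm(g)
    gap = np.linalg.norm(x - x_gd)
    return residual, gap


if __name__ == "__main__":
    residual, gap = simulate()
    print(f"relative distance from v = -grad f(x): {residual:.4f}")
    print(f"distance to gradient-flow trajectory:  {gap:.4f}")
```

Explicit Euler is used only to keep the sketch short; because the system is stiff for small `eps`, a smaller `eps` would need a correspondingly smaller `dt` or an implicit integrator.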
