Accelerated Gradient Flow: Risk, Stability, and Implicit Regularization
Yue Sheng, Alnur Ali

TL;DR
This paper studies the statistical risk, stability, and implicit regularization of accelerated gradient methods, specifically Nesterov's accelerated gradient method and Polyak's heavy ball method, applied to least squares regression. It uses a continuous-time analysis to expose interactions between early stopping, stability, and the curvature of the loss function.
Contribution
It provides a continuous-time analysis of the risk and stability of accelerated gradient methods, relating their implicit regularization to explicit penalization, early stopping, and the curvature of the loss.
Findings
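The statistical risk of the iterates of Nesterov's accelerated gradient method and Polyak's heavy ball method, on least squares regression, connects in several ways to explicit penalization.
Early stopping, stability, and the curvature of the loss function interact in complex ways under acceleration.
Working in continuous time permits sharper statements than prior work.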
Abstract
Acceleration and momentum are the de facto standard in modern applications of machine learning and optimization, yet the bulk of the work on implicit regularization focuses instead on unaccelerated methods. In this paper, we study the statistical risk of the iterates generated by Nesterov's accelerated gradient method and Polyak's heavy ball method, when applied to least squares regression, drawing several connections to explicit penalization. We carry out our analyses in continuous time, allowing us to make sharper statements than in prior work, and revealing complex interactions between early stopping, stability, and the curvature of the loss function.
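For orientation, the two methods named in the abstract can be sketched in their standard discrete-time forms on a least squares problem. This is only an illustration under stated assumptions, not the paper's continuous-time analysis: the function names (`make_data`, `heavy_ball`, `nesterov`), the synthetic Gaussian data, the step size 1/L, the heavy ball momentum of 0.5, and the (k-1)/(k+2) Nesterov schedule are all choices made here, not taken from the paper.

```python
import numpy as np


def make_data(n=200, p=10, noise=0.5, seed=0):
    """Synthetic linear model y = X beta_star + noise (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    beta_star = rng.standard_normal(p)
    y = X @ beta_star + noise * rng.standard_normal(n)
    return X, y, beta_star


def grad(X, y, beta):
    """Gradient of the least squares loss (1 / 2n) * ||y - X beta||^2."""
    return X.T @ (X @ beta - y) / X.shape[0]


def default_step(X):
    """Step size 1 / L, with L the largest eigenvalue of X^T X / n."""
    return 1.0 / np.linalg.eigvalsh(X.T @ X / X.shape[0]).max()


def heavy_ball(X, y, iters, step=None, momentum=0.5):
    """Polyak's heavy ball: gradient step plus a fixed momentum term."""
    step = default_step(X) if step is None else step
    beta = np.zeros(X.shape[1])
    prev = beta.copy()
    path = [beta.copy()]
    for _ in range(iters):
        new = beta - step * grad(X, y, beta) + momentum * (beta - prev)
        prev, beta = beta, new
        path.append(beta.copy())
    return np.array(path)  # shape (iters + 1, p), starting at zero


def nesterov(X, y, iters, step=None):
    """Nesterov's method: gradient taken at an extrapolated point."""
    step = default_step(X) if step is None else step
    beta = np.zeros(X.shape[1])
    prev = beta.copy()
    path = [beta.copy()]
    for k in range(1, iters + 1):
        z = beta + (k - 1) / (k + 2) * (beta - prev)  # look-ahead point
        new = z - step * grad(X, y, z)
        prev, beta = beta, new
        path.append(beta.copy())
    return np.array(path)


X, y, beta_star = make_data()
hb_path = heavy_ball(X, y, iters=500)
nag_path = nesterov(X, y, iters=500)

# Estimation error along each path; its minimizer is one candidate
# early-stopping time for this particular draw of the data.
hb_err = np.linalg.norm(hb_path - beta_star, axis=1)
nag_err = np.linalg.norm(nag_path - beta_star, axis=1)
print("heavy ball: best iteration", int(hb_err.argmin()))
print("nesterov:   best iteration", int(nag_err.argmin()))
```

Both paths start at zero and, run long enough on this well-conditioned problem, approach the ordinary least squares solution. Stopping either method earlier yields intermediate iterates, which is the sense in which early stopping is compared to explicit penalization.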
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Statistical Methods and Inference
