Accelerated Gradient Flow: Risk, Stability, and Implicit Regularization

Yue Sheng; Alnur Ali

arXiv:2201.08311 · stat.ML · January 21, 2022 · 1 citation

Open Access

TL;DR

This paper studies the statistical risk, stability, and implicit regularization of Nesterov's accelerated gradient method and Polyak's heavy ball method in least squares regression, using a continuous-time analysis that reveals interactions between early stopping, stability, and the curvature of the loss.

Contribution

It analyzes the risk and stability of accelerated gradient methods in continuous time, connecting their implicit regularization to explicit penalization, early stopping, and the curvature of the loss.

Findings

01

Accelerated methods exhibit unique risk and stability behaviors.

02

Early stopping interacts in complex ways with stability and the curvature of the loss function.

03

Continuous-time analysis allows sharper statements than in prior work.

Abstract

Acceleration and momentum are the de facto standard in modern applications of machine learning and optimization, yet the bulk of the work on implicit regularization focuses instead on unaccelerated methods. In this paper, we study the statistical risk of the iterates generated by Nesterov's accelerated gradient method and Polyak's heavy ball method, when applied to least squares regression, drawing several connections to explicit penalization. We carry out our analyses in continuous-time, allowing us to make sharper statements than in prior work, and revealing complex interactions between early stopping, stability, and the curvature of the loss function.
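
As a rough illustration of the two methods named in the abstract, the following is a minimal pure-Python sketch of their standard discrete-time updates on a least squares objective. The paper itself works in continuous time, so this is not the authors' formulation. The function names, the 1/(2n) loss scaling, the zero initialization, the (k - 1)/(k + 2) momentum schedule for Nesterov's method, and the toy data are all assumptions made for this example.

```python
def grad(X, y, beta):
    """Gradient of f(beta) = (1 / (2n)) * ||y - X beta||^2 (assumed scaling)."""
    n, p = len(X), len(beta)
    resid = [sum(X[i][j] * beta[j] for j in range(p)) - y[i] for i in range(n)]
    return [sum(X[i][j] * resid[i] for i in range(n)) / n for j in range(p)]


def heavy_ball(X, y, step, momentum, iters):
    """Polyak's heavy ball: gradient step plus a fixed multiple of the last move."""
    p = len(X[0])
    beta, prev = [0.0] * p, [0.0] * p
    for _ in range(iters):
        g = grad(X, y, beta)
        new = [beta[j] - step * g[j] + momentum * (beta[j] - prev[j])
               for j in range(p)]
        prev, beta = beta, new
    return beta


def nesterov(X, y, step, iters):
    """Nesterov's method: the gradient is taken at an extrapolated point."""
    p = len(X[0])
    beta, prev = [0.0] * p, [0.0] * p
    for k in range(1, iters + 1):
        w = (k - 1) / (k + 2)  # assumed momentum schedule, grows toward 1
        look = [beta[j] + w * (beta[j] - prev[j]) for j in range(p)]
        g = grad(X, y, look)
        prev, beta = beta, [look[j] - step * g[j] for j in range(p)]
    return beta


# Toy noiseless problem (hypothetical data): y = X @ [2, -1].
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [2.0, -1.0, 1.0]

# Stopping after fewer iterations gives estimates closer to the zero
# initialization; this is the sense in which early stopping regularizes.
for iters in (5, 50, 500):
    print(iters, heavy_ball(X, y, 0.5, 0.5, iters), nesterov(X, y, 0.5, iters))
```

The only structural difference between the two loops is where the gradient is evaluated: heavy ball uses the current iterate, while Nesterov's method uses the extrapolated point `look`.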

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Statistical Methods and Inference