Accelerated SGD for Non-Strongly-Convex Least Squares

Aditya Varre; Nicolas Flammarion

arXiv:2203.01744 · cs.LG · March 4, 2022


TL;DR

This paper introduces a practical accelerated stochastic gradient descent algorithm for non-strongly convex least squares regression, achieving optimal prediction error rates and fast initial condition forgetting.

Contribution

It presents the first practical accelerated SGD algorithm for non-strongly convex least squares whose prediction error has optimal dependence on the noise, with convergence guarantees for both the averaged and the last iterate.

Findings

01. Achieves the optimal $O(d/t)$ prediction error rate with respect to the noise of the problem.

02. Accelerates the forgetting of the initial conditions to $O(d/t^2)$.

03. Shows optimality through a matching lower bound in the noiseless setting.

Abstract

We consider stochastic approximation for the least squares regression problem in the non-strongly convex setting. We present the first practical algorithm that achieves the optimal prediction error rates in terms of dependence on the noise of the problem, as $O(d/t)$, while accelerating the forgetting of the initial conditions to $O(d/t^{2})$. Our new algorithm is based on a simple modification of the accelerated gradient descent. We provide convergence results for both the averaged and the last iterate of the algorithm. In order to describe the tightness of these new bounds, we present a matching lower bound in the noiseless setting and thus show the optimality of our algorithm.
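The abstract describes the algorithm only as a simple modification of accelerated gradient descent and does not give the update rule. As a rough illustration of the setting, the sketch below runs a generic three-sequence, Nesterov-style recursion with single-sample least squares gradients and returns both the last and the averaged iterate. It is not claimed to be the authors' exact update or step-size conditions: the function names `accelerated_sgd` and `excess_risk`, the step sizes `alpha` and `beta`, and the particular coupling of the sequences are assumptions made for this example.

```python
import numpy as np


def excess_risk(A, b, x):
    """Empirical least squares objective 0.5 * mean((A x - b)^2)."""
    r = A @ x - b
    return 0.5 * float(np.mean(r ** 2))


def accelerated_sgd(A, b, alpha, beta, x0=None):
    """One streaming pass over the rows of (A, b).

    Hypothetical sketch: a gradient sequence y, an aggressive sequence z
    whose step grows linearly in t, and their weighted combination x.
    alpha and beta are illustrative step sizes, not values from the paper.
    """
    n, d = A.shape
    x = np.zeros(d) if x0 is None else np.asarray(x0, dtype=float).copy()
    z = x.copy()
    x_avg = x.copy()
    for t in range(n):
        a_t, b_t = A[t], b[t]
        g = (a_t @ x - b_t) * a_t          # single-sample gradient at x_t
        y = x - beta * g                   # short gradient step
        z = z - alpha * (t + 1) * g        # long step with growing weight
        x = ((t + 1) * y + z) / (t + 2)    # couple the two sequences
        x_avg += (x - x_avg) / (t + 2)     # running mean of x_0, ..., x_{t+1}
    return x, x_avg
```

A possible usage, on synthetic noiseless data with unit-norm features (again an assumption of this example), is to call `accelerated_sgd(A, b, alpha=0.04, beta=0.4)` and compare `excess_risk` at the returned iterates with its value at the starting point. Keeping `alpha` much smaller than `beta` is a conservative choice for stability with single-sample gradients.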


Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Domain Adaptation and Few-Shot Learning