Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression
Aymeric Dieuleveut (SIERRA, LIENS), Nicolas Flammarion (LIENS, SIERRA), Francis Bach (SIERRA, LIENS)

TL;DR
This paper introduces an averaged accelerated regularized gradient descent algorithm for stochastic least-squares regression. Its prediction error forgets the initial conditions at rate O(1/n^2) and depends on the noise and the dimension d as O(d/n), and the paper presents these as the jointly optimal rates.
Contribution
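The paper presents the first algorithm that jointly attains the optimal prediction error rates for least-squares regression, both in the forgetting of initial conditions and in the dependence on noise and dimension. A finer analysis under assumptions on the initial conditions and the Hessian matrix yields dimension-free bounds.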
Findings
Forgets initial conditions at rate O(1/n^2) in prediction error
Attains O(d/n) dependence on noise and dimension
Proven to be optimal via matching lower bounds in non-parametric regression
Abstract
We consider the optimization of a quadratic objective function whose gradients are only accessible through a stochastic oracle that returns the gradient at any given point plus a zero-mean finite variance random error. We present the first algorithm that achieves jointly the optimal prediction error rates for least-squares regression, both in terms of forgetting of initial conditions in O(1/n^2), and in terms of dependence on the noise and dimension d of the problem, as O(d/n). Our new algorithm is based on averaged accelerated regularized gradient descent, and may also be analyzed through finer assumptions on initial conditions and the Hessian matrix, leading to dimension-free quantities that may still be small while the "optimal" terms above are large. In order to characterize the tightness of these new bounds, we consider an application to non-parametric regression and use the…
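The setting described in the abstract (a quadratic objective, a stochastic oracle returning the gradient plus zero-mean noise, and an averaged, accelerated, regularized recursion) can be illustrated with a minimal Python sketch. This is not the authors' exact algorithm or tuning: the function names, the additive Gaussian noise model, the step size 1/(L + reg), and the momentum rule are all assumptions chosen for illustration.

```python
import numpy as np


def averaged_accelerated_sgd(H, q, theta0, n_iters, step, reg, noise_std, rng):
    """Averaged accelerated regularized gradient descent on a noisy quadratic.

    Objective (assumed form): f(theta) = 0.5 * theta^T H theta - q^T theta.
    The oracle returns the exact gradient plus zero-mean Gaussian noise.
    The momentum rule below is an illustrative assumption, not taken
    from the summary above.
    """
    root = np.sqrt(step * reg)
    delta = (1.0 - root) / (1.0 + root)  # assumed momentum coefficient
    theta_prev = theta0.copy()
    nu = theta0.copy()            # extrapolated point where gradients are queried
    theta_sum = theta0.copy()     # running sum used for the final average
    for _ in range(n_iters):
        # Stochastic oracle: true gradient plus zero-mean, finite-variance error.
        noisy_grad = H @ nu - q + noise_std * rng.standard_normal(nu.shape)
        # Gradient step with a regularizer pulling toward the starting point.
        theta = nu - step * (noisy_grad + reg * (nu - theta0))
        # Acceleration: extrapolate along the last displacement.
        nu = theta + delta * (theta - theta_prev)
        theta_prev = theta
        theta_sum += theta
    # Averaging: return the mean of theta_0, ..., theta_n.
    return theta_sum / (n_iters + 1)


def excess_risk(H, theta, theta_star):
    """f(theta) - f(theta_star) for the quadratic objective above."""
    diff = theta - theta_star
    return 0.5 * diff @ H @ diff


# Small synthetic problem (all values are hypothetical).
rng = np.random.default_rng(0)
d = 5
H = np.diag(np.linspace(0.1, 1.0, d))   # Hessian with largest eigenvalue L = 1
theta_star = np.ones(d)
q = H @ theta_star                      # makes theta_star the minimizer
theta0 = np.zeros(d)
reg = 0.01
step = 1.0 / (1.0 + reg)                # 1 / (L + reg)

theta_bar = averaged_accelerated_sgd(
    H, q, theta0, n_iters=2000, step=step, reg=reg, noise_std=0.1, rng=rng
)
initial_risk = excess_risk(H, theta0, theta_star)
final_risk = excess_risk(H, theta_bar, theta_star)
print(f"initial excess risk: {initial_risk:.4f}")
print(f"final excess risk:   {final_risk:.6f}")
```

In this sketch the regularizer introduces a small bias toward theta0, while averaging damps the oracle noise that momentum would otherwise amplify in the individual iterates.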
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and Algorithms
