Optimal non-asymptotic bound of the Ruppert-Polyak averaging without strong convexity
Sébastien Gadat, Fabien Panloup

TL;DR
This paper provides tight non-asymptotic bounds for the mean-squared error of Ruppert-Polyak averaged stochastic gradient descent. The bounds hold in the uniformly strongly convex case and under a weaker Kurdyka-Lojasiewicz-type condition, which covers some non-convex cases.
Contribution
It establishes optimal non-asymptotic bounds for the algorithm under very general conditions, extending previous results to non-strongly convex and non-convex scenarios.
Findings
Bounds are tight and optimal with respect to the Cramer-Rao lower bound.
Results apply to non-strongly convex and non-convex functions.
The framework covers pathological examples such as on-line logistic regression and recursive quantile estimation.
Abstract
This paper is devoted to the non-asymptotic control of the mean-squared error for the Ruppert-Polyak stochastic averaged gradient descent introduced in the seminal contributions of [Rup88] and [PJ92]. In our main results, we establish non-asymptotic tight bounds (optimal with respect to the Cramer-Rao lower bound) in a very general framework that includes the uniformly strongly convex case as well as the one where the function f to be minimized satisfies a weaker Kurdyka-Lojasiewicz-type condition [Loj63, Kur98]. In particular, it makes it possible to recover some pathological examples such as on-line learning for logistic regression (see [Bac14]) and recursive quantile estimation (an even non-convex situation).
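Illustration
The following is a minimal sketch of the averaging scheme, applied to the recursive quantile example mentioned in the abstract. It is not taken from the paper. The function name, its parameters, and the step size gamma_n = gamma0 * n^(-beta) with beta in (1/2, 1) are assumptions chosen for illustration.

```python
import numpy as np


def averaged_sgd_quantile(samples, alpha=0.5, gamma0=1.0, beta=0.75, theta0=0.0):
    """Robbins-Monro recursion for the alpha-quantile, plus its running average.

    Hypothetical helper for illustration only; the step-size schedule
    gamma0 * n**(-beta) is an assumed, commonly used choice.
    """
    theta = theta0
    theta_bar = theta0
    for n, x in enumerate(samples, start=1):
        step = gamma0 * n ** (-beta)
        # Noisy gradient of the quantile loss at theta: 1{x <= theta} - alpha.
        grad = float(x <= theta) - alpha
        theta = theta - step * grad
        # Online update so that theta_bar is the mean of theta_1, ..., theta_n.
        theta_bar += (theta - theta_bar) / n
    return theta, theta_bar


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(loc=2.0, scale=1.0, size=20_000)
    last_iterate, averaged = averaged_sgd_quantile(data, alpha=0.5)
    print("last iterate:", last_iterate)
    print("averaged iterate:", averaged)
```

The averaged iterate theta_bar is the quantity whose mean-squared error the paper bounds. The raw iterate theta is returned only for comparison.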
Taxonomy
Topics: Matrix Theory and Algorithms · Optimization and Variational Analysis · Numerical methods in inverse problems
