Optimal non-asymptotic bound of the Ruppert-Polyak averaging without strong convexity

Sébastien Gadat; Fabien Panloup

arXiv:1709.03342·math.ST·September 12, 2017·26 cites

TL;DR

This paper provides tight non-asymptotic bounds for the mean-squared error of Ruppert-Polyak averaged stochastic gradient descent. The bounds hold in the strongly convex case and under weaker Kurdyka-Łojasiewicz-type conditions, which also cover some non-convex cases.

Contribution

The paper establishes optimal non-asymptotic bounds for Ruppert-Polyak averaging in a very general framework, extending previous results to settings that are not strongly convex and to some that are not convex.

Findings

01

Bounds are tight and optimal with respect to the Cramér-Rao lower bound.

02

Results apply to functions that are not strongly convex, and to some non-convex ones, provided a Kurdyka-Łojasiewicz-type condition holds.

03

The framework covers pathological examples such as on-line learning for logistic regression and recursive quantile estimation.

Abstract

This paper is devoted to the non-asymptotic control of the mean-squared error for the Ruppert-Polyak stochastic averaged gradient descent introduced in the seminal contributions of [Rup88] and [PJ92]. In our main results, we establish non-asymptotic tight bounds (optimal with respect to the Cramér-Rao lower bound) in a very general framework that includes the uniformly strongly convex case as well as the one where the function f to be minimized satisfies a weaker Kurdyka-Łojasiewicz-type condition [Loj63, Kur98]. In particular, it makes it possible to recover some pathological examples such as on-line learning for logistic regression (see [Bac14]) and recursive quantile estimation (a situation that is not even convex).
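
To illustrate the averaging scheme the abstract refers to, here is a minimal sketch of Ruppert-Polyak averaging applied to recursive quantile estimation. It is not taken from the paper. The function name `averaged_quantile`, the step-size schedule gamma * n^(-beta) with beta = 0.75, the starting point, and the Gaussian test data are all assumptions chosen for illustration. The stochastic gradient step uses the indicator of the event "sample <= current estimate" minus the target level alpha, and the averaged estimate is the running mean of the iterates.

```python
import random


def averaged_quantile(samples, alpha, gamma=1.0, beta=0.75, theta0=0.0):
    """Recursive quantile estimation with Ruppert-Polyak averaging.

    Hypothetical helper for illustration only. Returns the last
    stochastic-gradient iterate and the running average of all iterates.
    """
    theta = theta0      # current stochastic-gradient iterate
    theta_bar = 0.0     # running average of the iterates theta_1..theta_n
    for n, x in enumerate(samples, start=1):
        step = gamma * n ** (-beta)  # assumed schedule, beta in (1/2, 1)
        # Noisy gradient of the quantile loss: 1{x <= theta} - alpha.
        grad = (1.0 if x <= theta else 0.0) - alpha
        theta -= step * grad
        # Online update of the mean, equivalent to (1/n) * sum of iterates.
        theta_bar += (theta - theta_bar) / n
    return theta, theta_bar


# Estimate the median of a standard Gaussian (true value 0.0),
# starting deliberately away from it.
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
last_iterate, averaged = averaged_quantile(samples, alpha=0.5, theta0=1.0)
print(f"last iterate: {last_iterate:.4f}, averaged: {averaged:.4f}")
```

The average is updated online so that no iterate history needs to be stored. The slowly decreasing step size (beta strictly below 1) is the usual setting in which averaging is applied, but the specific constants here are untuned.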

Taxonomy

Topics: Matrix Theory and Algorithms · Optimization and Variational Analysis · Numerical methods in inverse problems