Non-parametric Stochastic Approximation with Large Step sizes

Aymeric Dieuleveut; Francis Bach

arXiv:1408.0361·math.ST·March 30, 2016

TL;DR

This paper demonstrates that using large step sizes in a stochastic gradient approach for kernel-based regression achieves optimal convergence rates across different smoothness regimes, even when the true predictor isn't in the RKHS.

Contribution

It introduces a non-parametric stochastic approximation method with large step sizes that attains optimal convergence rates in RKHS regression.

Findings

01

A sufficiently large step size lets the averaged algorithm attain optimal convergence rates.

02

Optimal rates achieved for various smoothness conditions.

03

Method works even if the true predictor is outside the RKHS.

Abstract

We consider the random-design least-squares regression problem within the reproducing kernel Hilbert space (RKHS) framework. Given a stream of independent and identically distributed input/output data, we aim to learn a regression function within an RKHS $H$, even if the optimal predictor (i.e., the conditional expectation) is not in $H$. In a stochastic approximation framework where the estimator is updated after each observation, we show that the averaged unregularized least-mean-square algorithm (a form of stochastic gradient), given a sufficiently large step-size, attains optimal rates of convergence for a variety of regimes for the smoothnesses of the optimal prediction function and the functions in $H$.
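As a rough illustration of the algorithm the abstract describes, the sketch below runs an unregularized least-mean-square recursion in an RKHS and averages the iterates. It assumes the usual form of the update, $g_n = g_{n-1} - \gamma\,(g_{n-1}(x_n) - y_n)\,K_{x_n}$ with $g_0 = 0$, and returns the average $\bar g_n = \frac{1}{n+1}\sum_{k=0}^{n} g_k$. The Gaussian kernel, its bandwidth, the constant step size, and the function names `gaussian_kernel`, `averaged_kernel_lms`, and `predict` are all assumptions made for this example; they are not taken from the paper, and the step-size schedules analyzed there may differ.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=0.2):
    """Gaussian kernel matrix between two 1-D arrays (illustrative choice)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-diff**2 / (2.0 * bandwidth**2))


def averaged_kernel_lms(x, y, step_size, kernel=gaussian_kernel):
    """One pass of unregularized kernel LMS with iterate averaging.

    Each iterate is stored as coefficients on the kernel functions K_{x_i}.
    Returns (last_coef, avg_coef): coefficients of the final iterate g_n and
    of the averaged iterate (g_0 + ... + g_n) / (n + 1).
    """
    n = len(x)
    gram = kernel(x, x)
    coef = np.zeros(n)
    for i in range(n):
        # Prediction of the current iterate at the new input x[i];
        # only the first i coefficients are non-zero at this point.
        pred = gram[i, :i] @ coef[:i]
        # Stochastic gradient step on the squared loss, no regularization.
        coef[i] = step_size * (y[i] - pred)
    # Coefficient i (0-indexed) is present in n - i of the n + 1 iterates
    # g_0, ..., g_n, so averaging rescales it by (n - i) / (n + 1).
    weights = (n - np.arange(n)) / (n + 1.0)
    return coef, coef * weights


def predict(x_train, coef, x_new, kernel=gaussian_kernel):
    """Evaluate sum_i coef[i] * K(x_train[i], .) at the points x_new."""
    return kernel(x_new, x_train) @ coef


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 1.0, size=500)
    y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(500)

    # Constant step size; 0.25 is an arbitrary value for this demo,
    # chosen relative to K(x, x) = 1 for the Gaussian kernel.
    last_coef, avg_coef = averaged_kernel_lms(x, y, step_size=0.25)

    grid = np.linspace(0.0, 1.0, 200)
    truth = np.sin(2 * np.pi * grid)
    for name, c in [("last iterate", last_coef), ("averaged", avg_coef)]:
        mse = np.mean((predict(x, c, grid) - truth) ** 2)
        print(f"{name}: mean squared error on grid = {mse:.4f}")
```

Each observation is used once and no penalty term appears in the update, which matches the "unregularized" and "updated after each observation" wording of the abstract; in this sketch the step size and the averaging are the only mechanisms controlling the estimator. Storing one coefficient per observation makes the cost of a pass quadratic in the number of samples.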
