Non-asymptotic Analysis of Biased Stochastic Approximation Scheme

Belhal Karimi; Blazej Miasojedow; Eric Moulines; Hoi-To Wai

arXiv:1902.00629 · stat.ML · June 18, 2019 · 27 cites

Open Access

TL;DR

This paper provides a non-asymptotic analysis of a general stochastic approximation scheme under relaxed assumptions, which makes the analysis applicable to tasks such as reinforcement learning, where updates are biased and not of gradient type.

Contribution

It extends stochastic approximation analysis to non-convex, biased, and Markov-dependent settings, covering algorithms like online EM and policy-gradient methods.

Findings

01 Analyzes SA with biased, non-gradient, and Markov-dependent updates.

02 Applies to online EM and reinforcement learning algorithms.

03 Provides convergence guarantees under relaxed assumptions.

Abstract

Stochastic approximation (SA) is a key method used in statistical learning. Recently, its non-asymptotic convergence analysis has been considered in many papers. However, most of the prior analyses are made under restrictive assumptions, such as unbiased gradient estimates and a convex objective function, which significantly limit their application to sophisticated tasks such as online and reinforcement learning. These restrictions are all essentially relaxed in this work. In particular, we analyze a general SA scheme to minimize a non-convex, smooth objective function. We consider an update procedure whose drift term depends on a state-dependent Markov chain and whose mean field is not necessarily of gradient type, covering approximate second-order methods and allowing asymptotic bias for the one-step updates. We illustrate these settings with the online EM algorithm and the policy-gradient…
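
As a rough illustration of the setting the abstract describes, the Python sketch below runs a scalar SA recursion whose drift is driven by a Markov chain that depends on the current iterate and carries a small constant bias. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `grad_v` and `run_sa`, the toy non-convex objective, the AR(1) noise chain, the bias value, and the constant step size.

```python
import math
import random


def grad_v(eta):
    """Gradient of the toy objective V(eta) = eta**2 / 2 + 2 * cos(eta).

    V is smooth but non-convex: V''(0) = -1, so eta = 0 is a local maximum.
    """
    return eta - 2.0 * math.sin(eta)


def run_sa(n_steps, seed=0, bias=0.01, rho=0.5, sigma=0.5):
    """Run eta <- eta - gamma * H(eta, x) with Markov noise x (illustrative only)."""
    rng = random.Random(seed)
    eta, x = 0.5, 0.0
    gamma = 0.5 / math.sqrt(n_steps)  # constant step size (an assumption)
    for _ in range(n_steps):
        # State-dependent Markov chain: the innovation scale depends on eta.
        scale = sigma * (1.0 + 0.5 * math.tanh(eta))
        x = rho * x + scale * rng.gauss(0.0, 1.0)
        # Biased drift: the mean field is grad_v(eta) + bias, not grad_v(eta).
        drift = grad_v(eta) + bias + x
        eta -= gamma * drift
    return eta


if __name__ == "__main__":
    eta = run_sa(20000)
    print("final iterate:", eta, "gradient there:", grad_v(eta))
```

With these hypothetical settings, the iterate should settle near a stationary point of the toy objective, offset slightly by the bias term. The noise is correlated across steps rather than independent, which is the feature that separates this setting from the usual unbiased-gradient analysis.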

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research · Gaussian Processes and Bayesian Inference