Accelerating Optimization and Reinforcement Learning with Quasi-Stochastic Approximation

Shuhang Chen; Adithya Devraj; Andrey Bernstein; Sean Meyn

arXiv:2009.14431·math.OC·October 2, 2020

TL;DR

This paper extends the convergence theory of stochastic approximation to quasi-stochastic algorithms with deterministic signals, providing convergence rates, finite-time approximations, and applications to optimization and reinforcement learning.

Contribution

It develops a minimal-assumption theory for quasi-stochastic approximation, establishing convergence rates and finite-time behavior, and applies it to optimization and reinforcement learning algorithms.

Findings

01

Convergence rate of $1/t^{\rho}$ for gain $a_t = g/(1+t)^{\rho}$ with $\rho \in (0,1)$.

02

Finite-$t$ approximation involving a bounded zero-mean process.

03

Averaging achieves $1/t$ convergence only if the bias term $\bar{Y}$ is zero.
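
The first finding can be illustrated numerically. The sketch below is not taken from the paper: it assumes a hypothetical scalar model $f(\theta, \xi) = -(\theta - \theta^{*}) + \xi$ with a sinusoidal probing signal $\xi_t = \sin(\omega t)$, and integrates the quasi-stochastic approximation ODE $\frac{d}{dt}\theta_t = a_t f(\theta_t, \xi_t)$ with a plain Euler scheme. The names `qsa_gain` and `run_qsa`, and the values of `theta_star`, `omega`, `horizon`, and `dt`, are illustrative choices only.

```python
import math


def qsa_gain(t, g=1.0, rho=0.7):
    """Vanishing gain a_t = g / (1 + t)^rho with rho in (0, 1)."""
    return g / (1.0 + t) ** rho


def run_qsa(theta0=5.0, theta_star=2.0, g=1.0, rho=0.7,
            omega=1.0, horizon=1000.0, dt=0.01):
    """Euler integration of d/dt theta = a_t * f(theta, xi_t).

    Assumed toy model (not from the paper):
        f(theta, xi) = -(theta - theta_star) + xi,  xi_t = sin(omega * t).
    The linearization at theta_star is A* = -1, which is Hurwitz.
    """
    theta = theta0
    n_steps = int(round(horizon / dt))
    for k in range(n_steps):
        t = k * dt
        xi = math.sin(omega * t)          # deterministic "noise" signal
        f = -(theta - theta_star) + xi    # vector field evaluated at (theta, xi)
        theta += dt * qsa_gain(t, g, rho) * f
    return theta


HORIZON = 1000.0
RHO = 0.7
THETA_STAR = 2.0

theta_T = run_qsa(theta_star=THETA_STAR, rho=RHO, horizon=HORIZON)
error = abs(theta_T - THETA_STAR)
# If the error decays like 1/t^rho, this rescaled error stays bounded.
scaled_error = error * (1.0 + HORIZON) ** RHO

print(f"theta_T = {theta_T:.6f}, error = {error:.3e}, "
      f"t^rho * error = {scaled_error:.3f}")
```

Rerunning with several values of `horizon` and checking that `scaled_error` remains bounded gives an informal check of the $1/t^{\rho}$ rate in this toy setting; it is not a substitute for the paper's analysis.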

Abstract

The ODE method has been a workhorse for algorithm design and analysis since the introduction of the stochastic approximation. It is now understood that convergence theory amounts to establishing robustness of Euler approximations for ODEs, while theory of rates of convergence requires finer analysis. This paper sets out to extend this theory to quasi-stochastic approximation, based on algorithms in which the "noise" is based on deterministic signals. The main results are obtained under minimal assumptions: the usual Lipschitz conditions for ODE vector fields, and it is assumed that there is a well defined linearization near the optimal parameter $\theta^{*}$, with Hurwitz linearization matrix $A^{*}$. The main contributions are summarized as follows: (i) If the algorithm gain is $a_{t} = g / (1 + t)^{\rho}$ with $g > 0$ and $\rho \in (0, 1)$, then the rate of convergence of the algorithm is…

Taxonomy

Topics: Stochastic processes and financial applications · Advanced Bandit Algorithms Research · Risk and Portfolio Optimization