Convergence of Batch Asynchronous Stochastic Approximation With Applications to Reinforcement Learning

Rajeeva L. Karandikar; M. Vidyasagar

arXiv:2109.03445 · stat.ML · September 10, 2025 · 1 citation

Open Access

TL;DR

This paper studies the convergence of Block Asynchronous Stochastic Approximation (BASA), a generalization of stochastic approximation in which some but not necessarily all components of the iterate are updated at each step. It gives conditions for convergence and bounds on the rate of convergence, with applications to reinforcement learning.

Contribution

The paper introduces a unified framework for analyzing both synchronous and asynchronous stochastic approximation, including new convergence conditions and rate bounds for BASA.

Findings

01. Established convergence conditions for BASA.

02. Derived bounds on the rate of convergence.

03. Applied results to reinforcement learning fixed point problems.

Abstract

We begin by briefly surveying some results on the convergence of the Stochastic Gradient Descent (SGD) Method, proved in a companion paper by the present authors. These results are based on viewing SGD as a version of Stochastic Approximation (SA). Ever since its introduction in the classic paper of Robbins and Monro in 1951, SA has become a standard tool for finding a solution of an equation of the form $f(\theta) = 0$, when only noisy measurements of $f(\cdot)$ are available. In most situations, every component of the putative solution $\theta_t$ is updated at each step $t$. In some applications in Reinforcement Learning (RL), only one component of $\theta_t$ is updated at each $t$. This is known as asynchronous SA. In this paper, we study Block Asynchronous SA (BASA), in which, at each step $t$, some but not necessarily all components of…
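
The update pattern described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm or analysis; it only shows the idea of updating a random subset of components at each step from noisy measurements. The names `basa_solve` and `noisy_residual`, the Bernoulli selection rule with probability `update_prob`, the test function $f(\theta) = b - \theta$, and the per-component step size $1/n_i$ (where $n_i$ counts how often component $i$ has been updated) are all assumptions chosen for illustration.

```python
import random


def noisy_residual(theta, rng, target, noise_std=1.0):
    """Hypothetical noisy measurement of f(theta) = target - theta (root at theta = target)."""
    return [b - th + rng.gauss(0.0, noise_std) for th, b in zip(theta, target)]


def basa_solve(f_noisy, theta0, steps, update_prob=0.5, seed=0):
    """Sketch of a block asynchronous stochastic approximation loop.

    At each step, every component is selected independently with probability
    update_prob (an assumed selection rule), so some but not necessarily all
    components are updated. update_prob=1.0 gives the synchronous case.
    """
    rng = random.Random(seed)
    theta = list(theta0)
    counts = [0] * len(theta)  # number of updates applied to each component so far
    for _ in range(steps):
        y = f_noisy(theta, rng)  # noisy measurement of f at the current iterate
        for i in range(len(theta)):
            if rng.random() < update_prob:
                counts[i] += 1
                alpha = 1.0 / counts[i]  # assumed step size, based on the component's own update count
                theta[i] += alpha * y[i]
    return theta


if __name__ == "__main__":
    target = [1.0, -2.0, 0.5]
    estimate = basa_solve(
        lambda th, rng: noisy_residual(th, rng, target),
        theta0=[0.0, 0.0, 0.0],
        steps=20000,
    )
    print(estimate)
```

With this particular test function and step size, each component reduces to a running average of its own noisy observations, which is why the sketch settles near `target`; other functions or step-size rules would need the conditions studied in the paper.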

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research · Sparse and Compressive Sensing Techniques