Asynchronous Stochastic Approximation with Applications to Average-Reward Reinforcement Learning
Huizhen Yu, Yi Wan, Richard S. Sutton

TL;DR
This paper advances the theoretical understanding of asynchronous stochastic approximation algorithms, extending stability and convergence guarantees to more general noise conditions, with applications to average-reward reinforcement learning.
Contribution
The paper extends the stability proof method of Borkar and Meyn to more general noise conditions and analyzes the shadowing properties of asynchronous SA, providing a theoretical foundation for average-reward reinforcement learning algorithms.
Findings
Broader convergence guarantees for asynchronous SA.
Extended stability proof under general noise conditions.
Theoretical foundation for average-reward RL algorithms.
Abstract
This paper investigates the stability and convergence properties of asynchronous stochastic approximation (SA) algorithms, with a focus on extensions relevant to average-reward reinforcement learning. We first extend a stability proof method of Borkar and Meyn to accommodate more general noise conditions than previously considered, thereby yielding broader convergence guarantees for asynchronous SA. To sharpen the convergence analysis, we further examine the shadowing properties of asynchronous SA, building on a dynamical systems approach of Hirsch and Benaïm. These results provide a theoretical foundation for a class of relative value iteration-based reinforcement learning algorithms, developed and analyzed in a companion paper, for solving average-reward Markov and semi-Markov decision processes.
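For readers unfamiliar with the setting, the following is a minimal sketch of what "asynchronous" means in stochastic approximation: at each step only one component of the iterate is updated, using a step size driven by that component's own update count. It is a toy illustration and not the paper's algorithm or noise model. The function name `async_sa`, the linear mean field h(x) = b - Ax, the uniform component selection, the Gaussian noise, and the 1/k step sizes are all assumptions chosen for simplicity.

```python
import random


def async_sa(A, b, num_steps=30000, noise_std=0.1, seed=0):
    """Toy asynchronous SA for solving h(x) = b - A x = 0 (hypothetical example).

    Each step updates a single, randomly chosen component i, using a step
    size that depends on how often component i has been updated so far.
    """
    rng = random.Random(seed)
    d = len(b)
    x = [0.0] * d
    counts = [0] * d  # per-component update counts (local clocks)
    for _ in range(num_steps):
        i = rng.randrange(d)      # assumption: uniform component selection
        counts[i] += 1
        step = 1.0 / counts[i]    # assumption: step size 1/k on the local clock
        # Noisy observation of the i-th component of h(x) = b - A x.
        h_i = b[i] - sum(A[i][j] * x[j] for j in range(d))
        x[i] += step * (h_i + rng.gauss(0.0, noise_std))
    return x, counts


# Symmetric positive definite A, so the associated ODE x' = b - A x is stable.
A = [[2.0, 0.5, 0.0],
     [0.5, 2.0, 0.5],
     [0.0, 0.5, 2.0]]
x_star = [1.0, -2.0, 0.5]
b = [sum(A[i][j] * x_star[j] for j in range(3)) for i in range(3)]

x, counts = async_sa(A, b)
max_err = max(abs(x[i] - x_star[i]) for i in range(3))
print("estimate:", x)
print("max abs error:", max_err)
```

In this toy problem the iterates settle near the solution of Ax = b even though no step ever updates the full vector. The stability and convergence questions studied in the paper concern when such behavior can be guaranteed under far more general conditions than those assumed here.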
Taxonomy
Topics: Elevator Systems and Control · Traffic control and management
