Finite-Time Analysis of Asynchronous Q-learning under Diminishing Step-Size from Control-Theoretic View

Han-Dong Lim; Donghwan Lee

arXiv:2207.12217·cs.AI·July 26, 2022

Open Access

TL;DR

This paper provides a finite-time convergence analysis of asynchronous Q-learning with diminishing step-sizes, using a control-theoretic approach that improves existing results and offers new insights into the algorithm's behavior.

Contribution

It introduces a novel switching system model for Q-learning with diminishing step-sizes, achieving improved convergence rates and simplifying the analysis process.

Findings

01

Achieves an \(\mathcal{O}\left( \sqrt{\frac{\log k}{k}} \right)\) convergence rate.

02

Provides a new control-system perspective on Q-learning analysis.

03

Demonstrates ( rac{\

Abstract

Q-learning has long been one of the most popular reinforcement learning algorithms, and its theoretical analysis has been an active research topic for decades. Although research on the asymptotic convergence of Q-learning has a long tradition, non-asymptotic convergence has only recently come under active study. The main goal of this paper is to develop a new finite-time analysis of asynchronous Q-learning under Markovian observation models from a control system viewpoint. In particular, we introduce a discrete-time time-varying switching system model of Q-learning with diminishing step-sizes for our analysis, which significantly improves on the recent development of the switching system analysis with constant step-sizes, and leads to an \(\mathcal{O}\left( \sqrt{\frac{\log k}{k}} \right)\) convergence rate that is comparable to or better than most of the state of the art…
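The setting the abstract describes (asynchronous updates along a single Markovian trajectory, with a diminishing step-size) can be illustrated with a minimal tabular sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the step-size schedule `theta / (k + xi)`, the uniform random behavior policy, the function names, and the list-of-lists layout of `P` and `R`.

```python
import random


def q_value_iteration(P, R, gamma, iters=500):
    """Reference Q* via value iteration (requires a known model; for comparison only)."""
    n_states, n_actions = len(P), len(P[0])
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(iters):
        V = [max(row) for row in Q]  # greedy state values under the current Q
        Q = [[R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))
              for a in range(n_actions)]
             for s in range(n_states)]
    return Q


def async_q_learning(P, R, gamma, num_steps, theta=10.0, xi=10.0, seed=0):
    """Asynchronous Q-learning along one trajectory with a diminishing step-size.

    P[s][a] is a list of next-state probabilities and R[s][a] a deterministic
    reward. theta and xi are hypothetical schedule parameters.
    """
    rng = random.Random(seed)
    n_states, n_actions = len(P), len(P[0])
    Q = [[0.0] * n_actions for _ in range(n_states)]
    s = rng.randrange(n_states)
    for k in range(num_steps):
        a = rng.randrange(n_actions)  # assumed uniform behavior policy
        s_next = rng.choices(range(n_states), weights=P[s][a])[0]
        alpha = theta / (k + xi)      # assumed diminishing step-size
        td_error = R[s][a] + gamma * max(Q[s_next]) - Q[s][a]
        Q[s][a] += alpha * td_error   # only the visited (s, a) entry changes
        s = s_next                    # Markovian sampling: continue the same trajectory
    return Q
```

On a small hypothetical MDP, comparing the output of `async_q_learning` with `q_value_iteration` entry by entry gives an empirical sup-norm error that can be tracked as `num_steps` grows. This sketch does not reproduce the paper's switching system model or its bound.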

Taxonomy

Topics: Analog and Mixed-Signal Circuit Design · Age of Information Optimization · Stability and Control of Uncertain Systems

Methods: Q-Learning