Final Iteration Convergence Bound of Q-Learning: Switching System Approach

Donghwan Lee

arXiv:2205.05455·eess.SY·January 17, 2024

Open Access

TL;DR

This paper establishes a finite-time convergence bound for the final iterate of Q-learning using a switching system approach, addressing limitations of prior averaged-iterate bounds and offering new insights into RL algorithm analysis.

Contribution

It introduces a finite-time error bound for the final iterate of Q-learning based on a switching system framework, expanding analysis beyond averaged iterates.

Findings

01. Finite-time error bound for Q-learning's final iterate.

02. Analysis covers different scenarios compared to previous work.

03. Provides insights connecting Q-learning with discrete-time switching systems.

Abstract

Q-learning is one of the fundamental reinforcement learning (RL) algorithms, and its convergence has been the focus of extensive research over the past several decades. Recently, a new finite-time error bound and analysis for Q-learning was introduced using a switching system framework. This approach views the dynamics of Q-learning as a discrete-time stochastic switching system. The prior study established a finite-time error bound on the averaged iterates using Lyapunov functions, offering further insights into Q-learning. While valuable, that analysis focuses on error bounds for the averaged iterate, which has an inherent disadvantage: it requires extra averaging steps, which can slow the convergence rate. Moreover, the final iterate, being the original format of Q-learning, is more commonly used and is often regarded as a more intuitive and natural form in the…
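To make the distinction between the final iterate and the averaged iterate concrete, the following is a minimal sketch of tabular Q-learning that returns both. It is an illustration only, not the paper's analysis. The two-state MDP (EXAMPLE_P, EXAMPLE_R), the i.i.d. uniform sampling of state-action pairs, the constant step size, and all function names are assumptions made for this example.

```python
import random

# Hypothetical 2-state, 2-action MDP used only for illustration.
# EXAMPLE_P[s][a] lists next-state probabilities; EXAMPLE_R[s][a] is the reward.
EXAMPLE_P = [[[0.9, 0.1], [0.2, 0.8]],
             [[0.5, 0.5], [0.1, 0.9]]]
EXAMPLE_R = [[1.0, 0.0],
             [0.0, 2.0]]


def q_learning(P, R, gamma, alpha, steps, seed=0):
    """Tabular Q-learning with a constant step size.

    Assumes (s, a) pairs are drawn i.i.d. and uniformly at each step.
    Returns (final iterate, running average of all iterates).
    """
    rng = random.Random(seed)
    n_s, n_a = len(P), len(P[0])
    Q = [[0.0] * n_a for _ in range(n_s)]
    Q_avg = [[0.0] * n_a for _ in range(n_s)]
    for k in range(1, steps + 1):
        s = rng.randrange(n_s)
        a = rng.randrange(n_a)
        s_next = rng.choices(range(n_s), weights=P[s][a])[0]
        # Standard Q-learning update on the sampled transition.
        td_error = R[s][a] + gamma * max(Q[s_next]) - Q[s][a]
        Q[s][a] += alpha * td_error
        # Extra averaging step: incremental mean of Q_1, ..., Q_k.
        for i in range(n_s):
            for j in range(n_a):
                Q_avg[i][j] += (Q[i][j] - Q_avg[i][j]) / k
    return Q, Q_avg


def optimal_q(P, R, gamma, iters=2000):
    """Approximate Q* by repeated application of the Bellman optimality operator."""
    n_s, n_a = len(P), len(P[0])
    Q = [[0.0] * n_a for _ in range(n_s)]
    for _ in range(iters):
        Q = [[R[s][a] + gamma * sum(p * max(Q[t]) for t, p in enumerate(P[s][a]))
              for a in range(n_a)] for s in range(n_s)]
    return Q


def sup_error(Q, Q_star):
    """Largest absolute entrywise difference between two Q-tables."""
    return max(abs(q - qs)
               for row, row_s in zip(Q, Q_star)
               for q, qs in zip(row, row_s))


if __name__ == "__main__":
    q_star = optimal_q(EXAMPLE_P, EXAMPLE_R, gamma=0.9)
    q_final, q_avg = q_learning(EXAMPLE_P, EXAMPLE_R, gamma=0.9,
                                alpha=0.05, steps=20000)
    print("final-iterate error   :", sup_error(q_final, q_star))
    print("averaged-iterate error:", sup_error(q_avg, q_star))
```

The averaged table is extra bookkeeping on top of the standard update, which is the overhead the abstract refers to. The final iterate is simply the table that the algorithm already maintains.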

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and ELM · Fault Detection and Control Systems

Methods: Q-Learning