A finite time analysis of distributed Q-learning

Han-Dong Lim; Donghwan Lee

arXiv:2405.14078·cs.AI·July 30, 2025

TL;DR

This paper provides a finite-time analysis and new sample complexity bounds for distributed Q-learning in multi-agent reinforcement learning, demonstrating how multiple agents can cooperatively learn optimal policies without centralized reward access.

Contribution

The paper gives a finite-time analysis of distributed Q-learning and derives new sample complexity bounds in the tabular setting, advancing the theoretical understanding of multi-agent RL algorithms.

Findings

01. New sample complexity bounds for distributed Q-learning

02. Finite-time convergence guarantees in multi-agent settings

03. Analysis under tabular lookup with cooperative agents
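
To make the tabular, cooperative setting concrete, here is a minimal sketch of one consensus-style Q-learning step. It is an illustration under stated assumptions, not necessarily the exact algorithm analyzed in the paper: the function name `distributed_q_step`, the doubly stochastic mixing matrix `W`, the constant step size `alpha`, and the choice to mix whole tables at every step are all hypothetical choices made for this example.

```python
import numpy as np


def distributed_q_step(Q, W, s, a, local_rewards, s_next, alpha, gamma):
    """One synchronous update of all agents at a shared transition (s, a, s_next).

    Q             : array of shape (N, S, A), one Q-table per agent.
    W             : (N, N) mixing matrix, assumed doubly stochastic.
    local_rewards : length-N array; agent i only observes local_rewards[i].
    """
    local_rewards = np.asarray(local_rewards, dtype=float)
    # Consensus step: agent i averages the tables of the other agents
    # with weights W[i, j].
    mixed = np.einsum("ij,jsa->isa", W, Q)
    # Local TD error, computed from each agent's own table and own reward.
    td = local_rewards + gamma * Q[:, s_next, :].max(axis=1) - Q[:, s, a]
    # Only the visited state-action entry receives the TD correction.
    mixed[:, s, a] += alpha * td
    return mixed


# Hypothetical usage: two agents, two states, two actions.
Q0 = np.zeros((2, 2, 2))
W = np.array([[0.5, 0.5], [0.5, 0.5]])
Q1 = distributed_q_step(Q0, W, s=0, a=0, local_rewards=[1.0, 3.0],
                        s_next=1, alpha=0.5, gamma=0.9)
```

No agent ever sees the average reward directly. In this sketch, information about the other agents' rewards reaches agent i only through the mixing step.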

Abstract

Multi-agent reinforcement learning (MARL) has witnessed a remarkable surge in interest, fueled by the empirical success achieved in applications of single-agent reinforcement learning (RL). In this study, we consider a distributed Q-learning scenario, wherein a number of agents cooperatively solve a sequential decision-making problem without access to the central reward function, which is an average of the local rewards. In particular, we present a finite-time analysis of a distributed Q-learning algorithm, and provide a new sample complexity result of $\tilde{O}\left(\min\left\{\frac{1}{\epsilon^{2}} \frac{t_{\mathrm{mix}}}{(1-\gamma)^{6} d_{\min}^{4}}, \frac{1}{\epsilon} \frac{|\mathcal{S}||\mathcal{A}|}{(1-\sigma_{2}(W))(1-\gamma)^{4} d_{\min}^{3}}\right\}\right)$ under tabular lookup
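
To see how the two terms inside the minimum trade off, the following sketch evaluates them numerically, following the expression as printed in the abstract with constants and logarithmic factors dropped. The function name and parameter names are hypothetical. The page does not define the symbols, so reading `t_mix` as a mixing time, `d_min` as a minimum state-action visitation probability, and `sigma2_W` as the quantity $\sigma_{2}(W)$ of a mixing matrix $W$ are assumptions of this example.

```python
def sample_complexity_terms(eps, gamma, d_min, t_mix, n_states, n_actions, sigma2_W):
    """Return (term_eps2, term_eps1, bound), ignoring constants and log factors.

    term_eps2 scales as 1/eps^2 and depends on t_mix.
    term_eps1 scales as 1/eps and depends on |S||A| and 1 - sigma2_W.
    bound is the smaller of the two.
    """
    term_eps2 = (1.0 / eps**2) * t_mix / ((1.0 - gamma) ** 6 * d_min**4)
    term_eps1 = (1.0 / eps) * (n_states * n_actions) / (
        (1.0 - sigma2_W) * (1.0 - gamma) ** 4 * d_min**3
    )
    return term_eps2, term_eps1, min(term_eps2, term_eps1)


# Hypothetical numbers, chosen only to exercise the formula.
t2, t1, bound = sample_complexity_terms(
    eps=0.1, gamma=0.5, d_min=0.5, t_mix=2,
    n_states=2, n_actions=2, sigma2_W=0.5,
)
```

Because the first term grows as $1/\epsilon^{2}$ and the second only as $1/\epsilon$, the second term becomes the smaller one once $\epsilon$ is small enough, with all other quantities held fixed.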

Taxonomy

Topics: Distributed Sensor Networks and Detection Algorithms · Machine Learning and ELM · Neural Networks and Applications

Methods: Q-Learning