A finite time analysis of distributed Q-learning
Han-Dong Lim, Donghwan Lee

TL;DR
This paper provides a finite-time analysis and new sample complexity bounds for distributed Q-learning in multi-agent reinforcement learning, demonstrating how multiple agents can cooperatively learn optimal policies without centralized reward access.
Contribution
The paper introduces a finite-time analysis of distributed Q-learning and derives new sample complexity bounds in the tabular setting, advancing the theoretical understanding of multi-agent RL algorithms.
Findings
New sample complexity bounds for distributed Q-learning
Finite-time convergence guarantees in multi-agent settings
Analysis under tabular lookup with cooperative agents
Abstract
Multi-agent reinforcement learning (MARL) has witnessed a remarkable surge in interest, fueled by the empirical success achieved in applications of single-agent reinforcement learning (RL). In this study, we consider a distributed Q-learning scenario, wherein a number of agents cooperatively solve a sequential decision-making problem without access to the central reward function, which is an average of the local rewards. In particular, we study the finite-time analysis of a distributed Q-learning algorithm and provide a new sample complexity result under the tabular lookup setting.
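To make the setting in the abstract concrete, the following is a minimal sketch of one consensus-style distributed Q-learning step. It is an illustration under stated assumptions, not necessarily the exact algorithm analyzed in the paper: the function name `distributed_q_step`, the mixing matrix `W` (assumed doubly stochastic), and the choice to mix only the visited state-action entry are all hypothetical. Each agent sees only its own reward, yet mixing estimates with neighbours lets the group track the Q-function of the averaged reward.

```python
import numpy as np


def distributed_q_step(Q, W, s, a, local_rewards, s_next, alpha, gamma):
    """One consensus-style distributed Q-learning step (illustrative sketch).

    Q:             array of shape (N, S, A), one tabular Q-function per agent.
    W:             array of shape (N, N), mixing matrix (assumed doubly stochastic).
    local_rewards: array of shape (N,), each agent's private reward for the
                   observed transition (s, a, s_next).
    """
    Q_next = Q.copy()
    # Local TD error: each agent uses only its own reward and its own table.
    td = local_rewards + gamma * Q[:, s_next, :].max(axis=1) - Q[:, s, a]
    # Consensus on the visited entry, followed by the local TD correction.
    Q_next[:, s, a] = W @ Q[:, s, a] + alpha * td
    return Q_next


# Toy usage (hypothetical values): two agents, one state, one action.
Q = np.zeros((2, 1, 1))
W = np.full((2, 2), 0.5)            # each agent averages with the other
rewards = np.array([0.0, 2.0])      # local rewards; their average is 1.0
for _ in range(2000):
    Q = distributed_q_step(Q, W, 0, 0, rewards, 0, alpha=0.1, gamma=0.5)
print(Q.mean())
```

In this toy problem the agents' average estimate settles at the value of the averaged reward, mean reward / (1 - gamma), while each individual table stays within a step-size-dependent offset of it. No single agent ever observes the central reward.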
Taxonomy
Topics: Distributed Sensor Networks and Detection Algorithms · Machine Learning and ELM · Neural Networks and Applications
Methods: Q-Learning
