Unsynchronized Decentralized Q-Learning: Two Timescale Analysis By Persistence

Bora Yongacoglu; Gürdal Arslan; Serdar Yüksel

arXiv:2308.03239 · cs.GT · March 19, 2025 · 1 citation

Open Access

TL;DR

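This paper analyzes an unsynchronized variant of the decentralized Q-learning algorithm for multi-agent reinforcement learning and gives sufficient conditions under which it drives play to equilibrium with high probability, without requiring agents to synchronize their policy updates. This broadens its applicability to decentralized settings where synchronization is infeasible.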

Contribution

The paper gives a high-probability convergence analysis for an unsynchronized variant of decentralized Q-learning that uses constant learning rates, relaxing the synchronization assumptions made in earlier work.
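To make the phrase "constant learning rates" concrete, here is a minimal sketch of a standard tabular Q-learning step with a fixed step size. It is not the paper's algorithm, whose details are not given on this page; the function names `make_q_table` and `q_update`, the parameters `alpha` and `gamma`, and their default values are all assumptions chosen for illustration.

```python
def make_q_table(n_states, n_actions):
    """Hypothetical helper: a Q-table initialized to zero."""
    return [[0.0] * n_actions for _ in range(n_states)]


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step with a constant learning rate.

    alpha stays fixed across all updates (it is not decayed over time),
    and gamma is the discount factor. Both defaults are illustrative.
    """
    # Bootstrapped target: immediate reward plus discounted best next value.
    td_target = reward + gamma * max(q[next_state])
    # Convex combination of the old estimate and the target, weighted by alpha.
    q[state][action] = (1 - alpha) * q[state][action] + alpha * td_target
    return q[state][action]
```

For example, starting from a zero table, `q_update(make_q_table(2, 2), 0, 1, reward=1.0, next_state=1)` moves the entry for state 0, action 1 one tenth of the way toward the target of 1.0. Because `alpha` never shrinks, recent observations keep a fixed weight in the estimate, whereas a decaying rate would make the estimate increasingly slow to change.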

Findings

01 Convergence to equilibrium with high probability

02 Constant learning rates are critical to the analysis

03 Applicable to a range of decentralized algorithms

Abstract

Non-stationarity is a fundamental challenge in multi-agent reinforcement learning (MARL), where agents update their behaviour as they learn. Many theoretical advances in MARL avoid the challenge of non-stationarity by coordinating the policy updates of agents in various ways, including synchronizing times at which agents are allowed to revise their policies. Synchronization enables analysis of many MARL algorithms via multi-timescale methods, but such synchronization is infeasible in many decentralized applications. In this paper, we study an unsynchronized variant of the decentralized Q-learning algorithm, a recent MARL algorithm for stochastic games. We provide sufficient conditions under which the unsynchronized algorithm drives play to equilibrium with high probability. Our solution utilizes constant learning rates in the Q-factor update, which we show to be critical for relaxing…

Taxonomy

Topics: Game Theory and Applications · Age of Information Optimization · Reinforcement Learning in Robotics

Methods: Q-Learning