MA2QL: A Minimalist Approach to Fully Decentralized Multi-Agent Reinforcement Learning

Kefan Su; Siyuan Zhou; Jiechuan Jiang; Chuang Gan; Xiangjun Wang; Zongqing Lu

arXiv:2209.08244·cs.LG·February 8, 2023·5 cites

Open Access

TL;DR

This paper introduces MA2QL, a simple yet theoretically grounded decentralized multi-agent reinforcement learning method where agents alternate Q-learning updates, effectively addressing non-stationarity and outperforming independent Q-learning in cooperative tasks.

Contribution

The paper proposes MA2QL, a minimalist, fully decentralized MARL algorithm with theoretical convergence guarantees, requiring minimal modifications to existing independent Q-learning methods.

Findings

01

MA2QL outperforms independent Q-learning in various cooperative tasks.

02

When each agent reaches $\varepsilon$-convergence during its turn, the alternating updates drive the joint policy to a Nash equilibrium.

03

Minimal changes to existing Q-learning suffice for effective decentralized learning.
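
For reference, the Nash equilibrium named in the second finding can be written out for a cooperative task. The notation is an assumption made here for illustration, not taken from the paper: $J$ is the shared expected return, $\pi_i$ is the policy of agent $i$, and $\pi_{-i}$ collects the policies of all other agents. A joint policy $\pi^* = (\pi_1^*, \dots, \pi_n^*)$ is a Nash equilibrium when no agent can raise the shared return by deviating on its own:

$$
J(\pi_i^*, \pi_{-i}^*) \;\ge\; J(\pi_i, \pi_{-i}^*) \quad \text{for every agent } i \text{ and every alternative policy } \pi_i .
$$

This is a local condition: a Nash equilibrium need not maximize $J$ over all joint policies.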

Abstract

Decentralized learning has shown great promise for cooperative multi-agent reinforcement learning (MARL). However, non-stationarity remains a significant challenge in fully decentralized learning. In this paper, we tackle the non-stationarity problem in the simplest and most fundamental way and propose multi-agent alternate Q-learning (MA2QL), where agents take turns updating their Q-functions by Q-learning. MA2QL is a minimalist approach to fully decentralized cooperative MARL but is theoretically grounded. We prove that when each agent guarantees $\varepsilon$-convergence at each turn, their joint policy converges to a Nash equilibrium. In practice, MA2QL only requires minimal changes to independent Q-learning (IQL). We empirically evaluate MA2QL on a variety of cooperative multi-agent tasks. Results show MA2QL consistently outperforms IQL, which verifies the effectiveness of MA2QL, despite…
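
The turn-taking idea in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' implementation. It assumes a two-agent, stateless cooperative matrix game with a shared reward, so the Q-learning target reduces to the immediate reward; it also assumes the non-learning agent acts greedily on its frozen Q-values during the other agent's turn. The names `alternate_q_learning`, `is_nash`, and all hyperparameters are hypothetical.

```python
import random


def alternate_q_learning(payoff, rounds=5, steps_per_turn=200,
                         lr=0.1, explore=0.2, seed=0):
    """Two agents take turns running Q-learning on a shared-reward matrix game.

    payoff[a0][a1] is the reward both agents receive for joint action (a0, a1).
    Only the agent whose turn it is explores and updates its Q-values; the
    other agent's Q-values stay frozen, so the learner faces a stationary task.
    """
    rng = random.Random(seed)
    n_actions = (len(payoff), len(payoff[0]))
    q = [[0.0] * n_actions[0], [0.0] * n_actions[1]]

    def greedy(values):
        # Index of the largest value; ties resolve to the lowest index.
        return max(range(len(values)), key=values.__getitem__)

    for _ in range(rounds):
        for learner in (0, 1):  # agents alternate turns
            for _ in range(steps_per_turn):
                actions = [greedy(q[0]), greedy(q[1])]
                # Epsilon-greedy exploration for the current learner only.
                if rng.random() < explore:
                    actions[learner] = rng.randrange(n_actions[learner])
                reward = payoff[actions[0]][actions[1]]
                a = actions[learner]
                # One-step Q-learning update (no bootstrap term: stateless game).
                q[learner][a] += lr * (reward - q[learner][a])
    return greedy(q[0]), greedy(q[1])


def is_nash(payoff, joint):
    """True if neither agent can raise the shared reward by deviating alone."""
    a0, a1 = joint
    current = payoff[a0][a1]
    no_gain_0 = all(payoff[alt][a1] <= current for alt in range(len(payoff)))
    no_gain_1 = all(payoff[a0][alt] <= current for alt in range(len(payoff[0])))
    return no_gain_0 and no_gain_1


if __name__ == "__main__":
    game = [[10.0, 0.0],
            [0.0, 5.0]]
    joint_action = alternate_q_learning(game)
    print(joint_action, is_nash(game, joint_action))
```

Freezing every agent except the current learner is what removes the non-stationarity in this toy setting: from the learner's point of view, the other agent is part of a fixed environment for the duration of the turn. Running independent Q-learning instead would amount to letting both agents explore and update at every step.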

Taxonomy

Topics: Reinforcement Learning in Robotics · Game Theory and Applications · Distributed Control Multi-Agent Systems

Methods: Q-Learning · Implicit Q-Learning