Factorized Q-Learning for Large-Scale Multi-Agent Systems

Ming Zhou; Yong Chen; Ying Wen; Yaodong Yang; Yufeng Su; Weinan Zhang; Dell Zhang; Jun Wang

arXiv:1809.03738 · cs.MA · October 14, 2019

TL;DR

This paper introduces a factorized Q-learning approach for large-scale multi-agent systems, reducing complexity and improving learning efficiency by approximating the joint Q-function with pairwise interactions and shared neural networks.

Contribution

It proposes a novel tensor factorization method for multi-agent Q-learning, enabling scalable and efficient learning in large multi-agent environments.
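
To make the scaling argument concrete, here is a worked comparison under assumed notation. The symbols below ($n$ agents, $m$ actions per agent, embedding dimension $k$, per-agent terms $q_i$ and $v_i$) are illustrative and do not appear on this page, and the factorization-machine-style form is one plausible reading of "pairwise factorization", not necessarily the paper's exact formulation.

```latex
% Assumed pairwise form: per-agent terms plus inner products of per-agent embeddings.
\[
Q(s, a_1, \dots, a_n) \;\approx\; \sum_{i=1}^{n} q_i(s, a_i)
  \;+\; \sum_{1 \le i < j \le n} \big\langle v_i(s, a_i),\, v_j(s, a_j) \big\rangle,
\qquad v_i(s, a_i) \in \mathbb{R}^{k}.
\]

% Values needed per state:
%   full joint table:  m^n
%   factorized form:   n m (1 + k)
\[
n = 10,\; m = 5,\; k = 8:
\qquad m^{n} = 5^{10} = 9{,}765{,}625
\qquad \text{vs.} \qquad
n\,m\,(1 + k) = 10 \cdot 5 \cdot 9 = 450 .
\]
```

The joint table grows exponentially in the number of agents, while the assumed factorized form grows linearly, which is the sense in which such a factorization can make large-scale settings tractable.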

Findings

01. Significant performance improvements over baselines in large multi-agent tasks

02. Reduced model complexity and faster learning convergence

03. Effective approximation of joint Q-functions with pairwise factorization

Abstract

Deep Q-learning has achieved significant success in single-agent decision-making tasks. However, it is challenging to extend Q-learning to large-scale multi-agent scenarios, due to the explosion of the action space resulting from the complex dynamics between the environment and the agents. In this paper, we propose to make the computation of multi-agent Q-learning tractable by treating the Q-function (w.r.t. state and joint action) as a high-order, high-dimensional tensor and then approximating it with factorized pairwise interactions. Furthermore, we utilize a composite deep neural network architecture for computing the factorized Q-function, share the model parameters among all the agents within the same group, and estimate the agents' optimal joint actions through a coordinate-descent-type algorithm. All these simplifications greatly reduce the model complexity and accelerate the learning…
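
The two ideas in the abstract, a pairwise-factorized Q-function and a coordinate-wise search for the joint action, can be sketched for a single fixed state as follows. This is a minimal illustration, not the paper's implementation: the arrays `unary` and `embed` stand in for the outputs of the neural network, the inner-product form of the pairwise term is an assumption, and the function names `factorized_q` and `coordinate_ascent` are hypothetical. Because Q-values are maximized, the coordinate-wise update is written as ascent.

```python
import numpy as np


def factorized_q(unary, embed, actions):
    """Assumed pairwise form for one state:
    Q(a) = sum_i unary[i, a_i] + sum_{i<j} <embed[i, a_i], embed[j, a_j]>.

    unary: (n_agents, n_actions) per-agent values (stand-in for network output)
    embed: (n_agents, n_actions, k) per-agent action embeddings (stand-in)
    actions: length-n_agents sequence of action indices
    """
    actions = np.asarray(actions, dtype=int)
    idx = np.arange(len(actions))
    chosen = embed[idx, actions]                      # (n_agents, k)
    total = chosen.sum(axis=0)
    # Identity: sum_{i<j} <v_i, v_j> = 0.5 * (|sum_i v_i|^2 - sum_i |v_i|^2)
    pairwise = 0.5 * (total @ total - (chosen * chosen).sum())
    return unary[idx, actions].sum() + pairwise


def coordinate_ascent(unary, embed, init_actions, max_sweeps=100):
    """Update one agent's action at a time with all others held fixed.

    Stops when a full sweep changes nothing, i.e. at a joint action that no
    single agent can improve. This is a local optimum, not necessarily global.
    """
    actions = np.array(init_actions, dtype=int)
    n_agents = len(actions)
    idx = np.arange(n_agents)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n_agents):
            chosen = embed[idx, actions]
            others = chosen.sum(axis=0) - chosen[i]   # sum of other agents' embeddings
            # Q as a function of agent i's action, up to a constant
            scores = unary[i] + embed[i] @ others     # (n_actions,)
            best = int(np.argmax(scores))
            if scores[best] > scores[actions[i]] + 1e-12:
                actions[i] = best
                changed = True
        if not changed:
            break
    return actions


# Usage with random stand-in values (hypothetical sizes)
rng = np.random.default_rng(0)
unary = rng.normal(size=(4, 3))
embed = rng.normal(size=(4, 3, 2))
joint = coordinate_ascent(unary, embed, init_actions=[0, 0, 0, 0])
print(joint, factorized_q(unary, embed, joint))
```

Each sweep costs on the order of n_agents × n_actions × k operations, in contrast to enumerating all n_actions^n_agents joint actions. The trade-off is that the result depends on the starting point and is only guaranteed to be stable against single-agent deviations.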
