Heterogeneous-Agent Reinforcement Learning

Yifan Zhong, Jakub Grudzien Kuba, Xidong Feng, Siyi Hu, Jiaming Ji, and Yaodong Yang

arXiv:2304.09870·cs.LG·December 29, 2023·22 cites

Open Access · 1 Repo

TL;DR

This paper introduces a new framework for cooperative multi-agent reinforcement learning with heterogeneous agents, providing algorithms with theoretical guarantees and demonstrating superior performance over existing methods.

Contribution

The paper proposes Heterogeneous-Agent Reinforcement Learning (HARL) algorithms, namely HATRL, HATRPO, and HAPPO, together with the Heterogeneous-Agent Mirror Learning (HAML) framework, with proven convergence and improved stability.

Findings

01. HARL algorithms outperform baselines like MAPPO and QMIX.

02. Theoretical guarantees include monotonic improvement and convergence to Nash Equilibrium.

03. Heterogeneous agents can be effectively coordinated with the proposed methods.

Abstract

The necessity for cooperation among intelligent machines has popularised cooperative multi-agent reinforcement learning (MARL) in AI research. However, many research endeavours rely heavily on parameter sharing among agents, which confines them to the homogeneous-agent setting and leads to training instability and a lack of convergence guarantees. To achieve effective cooperation in the general heterogeneous-agent setting, we propose Heterogeneous-Agent Reinforcement Learning (HARL) algorithms that resolve the aforementioned issues. Central to our findings are the multi-agent advantage decomposition lemma and the sequential update scheme. Based on these, we develop the provably correct Heterogeneous-Agent Trust Region Learning (HATRL), and derive HATRPO and HAPPO by tractable approximations. Furthermore, we discover a novel framework named Heterogeneous-Agent Mirror Learning (HAML),…
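The multi-agent advantage decomposition lemma named in the abstract can be illustrated numerically. The sketch below is not taken from the paper or its repository. It assumes the usual statement of the lemma: for agents taken in any order, the joint advantage of their actions equals the sum of per-agent advantages, each conditioned on the actions already fixed by the agents earlier in the order. The single-state random game and the helper names (`partial_q`, `joint_and_summed_advantage`, `random_game`) are hypothetical and exist only for this illustration.

```python
import itertools
import random


def partial_q(q, policies, fixed):
    """Expected joint Q-value with the actions in `fixed` (agent -> action)
    pinned and every other agent's action averaged under its own policy."""
    n = len(policies)
    free = [i for i in range(n) if i not in fixed]
    total = 0.0
    for combo in itertools.product(*(range(len(policies[i])) for i in free)):
        joint = dict(fixed)
        joint.update(zip(free, combo))
        prob = 1.0
        for i, a in zip(free, combo):
            prob *= policies[i][a]
        total += prob * q[tuple(joint[i] for i in range(n))]
    return total


def joint_and_summed_advantage(q, policies, order, actions):
    """Return (joint advantage, sum of sequential per-agent advantages)
    for the agents listed in `order`, taking actions[agent] each."""
    fixed = {}
    baseline = partial_q(q, policies, fixed)  # state value V(s)
    previous = baseline
    summed = 0.0
    for agent in order:
        fixed[agent] = actions[agent]
        current = partial_q(q, policies, fixed)
        # Advantage of this agent given the actions fixed before it.
        summed += current - previous
        previous = current
    joint = partial_q(q, policies, fixed) - baseline
    return joint, summed


def random_game(num_actions, seed=0):
    """Hypothetical single-state game: random joint Q-table and random
    independent (non-shared) policies, one per agent."""
    rng = random.Random(seed)
    q = {a: rng.uniform(-1.0, 1.0)
         for a in itertools.product(*(range(k) for k in num_actions))}
    policies = []
    for k in num_actions:
        weights = [rng.random() + 0.1 for _ in range(k)]
        z = sum(weights)
        policies.append([w / z for w in weights])
    return q, policies


# Three heterogeneous agents with 2, 3, and 3 actions respectively.
q, policies = random_game((2, 3, 3), seed=0)
joint, summed = joint_and_summed_advantage(
    q, policies, order=[2, 0, 1], actions=(1, 0, 2))
assert abs(joint - summed) < 1e-9
```

The equality holds for any ordering because the sum telescopes, which is what makes a one-agent-at-a-time update scheme plausible: each agent can improve its own term given the choices of the agents before it.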

Code & Models

Repositories

pku-marl/harl · PyTorch · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Mobile Crowdsensing and Crowdsourcing
