JointPPO: Diving Deeper into the Effectiveness of PPO in Multi-Agent Reinforcement Learning

Chenxing Liu; Guizhong Liu

arXiv:2404.11831 · cs.MA · July 8, 2024 · 3 cites

Open Access

TL;DR

JointPPO introduces a Transformer-based centralized training method for multi-agent reinforcement learning, effectively managing large joint action spaces and outperforming existing baselines in complex environments.

Contribution

The paper proposes a novel CTCE approach that applies PPO with a Transformer policy network, recasting joint decision-making as a sequence generation task to improve scalability.
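The scalability argument can be illustrated with a simple count. The sketch below is not from the paper: the function names are hypothetical, and it assumes every agent has the same number of discrete actions. It compares the number of outputs a flat joint policy would need with the number of per-agent logits produced when agents are decided one at a time.

```python
def joint_action_count(n_agents: int, n_actions: int) -> int:
    """Size of the flat joint action space (one output per joint action)."""
    return n_actions ** n_agents


def sequential_output_count(n_agents: int, n_actions: int) -> int:
    """Total logits emitted when each agent's action is chosen in turn."""
    return n_agents * n_actions


# Assumed toy setting: 10 discrete actions per agent.
for n in (2, 5, 10):
    print(n, joint_action_count(n, 10), sequential_output_count(n, 10))
```

The flat count grows exponentially in the number of agents, while the sequential count grows linearly, which is the motivation for treating joint decision-making as sequence generation.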

Findings

01. JointPPO outperforms strong baselines on the SMAC benchmark.

02. The Transformer-based policy effectively handles large joint action spaces.

03. Ablation studies highlight key factors influencing performance.

Abstract

While Centralized Training with Decentralized Execution (CTDE) has become the prevailing paradigm in Multi-Agent Reinforcement Learning (MARL), it may not be suitable for scenarios in which agents can fully communicate and share observations with each other. Fully centralized methods, also known as Centralized Training with Centralized Execution (CTCE) methods, can fully utilize observations of all the agents by treating the entire system as a single agent. However, traditional CTCE methods suffer from scalability issues due to the exponential growth of the joint action space. To address these challenges, in this paper we propose JointPPO, a CTCE method that uses Proximal Policy Optimization (PPO) to directly optimize the joint policy of the multi-agent system. JointPPO decomposes the joint policy into conditional probabilities, transforming the decision-making process into a sequence…
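The decomposition described in the abstract can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: `toy_conditional_logits` is a hypothetical stand-in for the Transformer policy, the fixed agent ordering and the clipping value `clip_eps=0.2` are assumed, and the objective shown is the standard PPO clipped surrogate applied to the joint probability ratio. By the chain rule, the joint log-probability is the sum of each agent's conditional log-probability given the shared observation and the actions already chosen.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_conditional_logits(obs, prev_actions, n_actions):
    """Hypothetical policy stand-in: logits for the next agent's action,
    conditioned on the shared observation and earlier agents' actions."""
    shift = sum(prev_actions)
    return [obs * 0.1 * ((k + shift) % n_actions) for k in range(n_actions)]


def sample_joint_action(obs, n_agents, n_actions, rng):
    """Sample agents one at a time and accumulate the joint log-probability."""
    actions, log_prob = [], 0.0
    for _ in range(n_agents):
        probs = softmax(toy_conditional_logits(obs, actions, n_actions))
        a = rng.choices(range(n_actions), weights=probs)[0]
        log_prob += math.log(probs[a])
        actions.append(a)
    return actions, log_prob


def joint_log_prob(obs, actions, n_actions):
    """Chain rule: sum of log pi(a_i | obs, a_1..a_{i-1}) over agents."""
    total = 0.0
    for i, a in enumerate(actions):
        probs = softmax(toy_conditional_logits(obs, actions[:i], n_actions))
        total += math.log(probs[a])
    return total


def ppo_clipped_objective(new_log_prob, old_log_prob, advantage, clip_eps=0.2):
    """Standard PPO clipped surrogate, evaluated on the joint ratio."""
    ratio = math.exp(new_log_prob - old_log_prob)
    clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return min(ratio * advantage, clipped * advantage)


rng = random.Random(0)
actions, logp = sample_joint_action(obs=1.0, n_agents=3, n_actions=4, rng=rng)
print(actions, logp, ppo_clipped_objective(logp, logp, advantage=2.0))
```

Because each conditional distribution is normalized, the product of the conditionals is a valid distribution over joint actions, so a single PPO ratio can be computed for the whole team without enumerating the joint action space.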

Taxonomy

Topics: Transportation and Mobility Innovations · Advanced Software Engineering Methodologies