Centralized Permutation Equivariant Policy for Cooperative Multi-Agent Reinforcement Learning

Zhuofan Xu; Benedikt Bollig; Matthias Függer; Thomas Nowak; Vincent Le Dréau

arXiv:2508.11706 · cs.MA · August 19, 2025

Open Access

TL;DR

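This paper introduces Centralized Permutation Equivariant (CPE) learning, a framework that trains and executes a single centralized, permutation-equivariant policy for cooperative multi-agent reinforcement learning. It targets the performance gap of decentralized policies under partial observability while avoiding the scalability problems typical of fully centralized approaches.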

Contribution

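The paper proposes Global-Local Permutation Equivariant (GLPE) networks, a novel permutation equivariant architecture described as lightweight, scalable, and easy to implement, and uses it as a fully centralized policy for both training and execution in multi-agent RL.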

Findings

01. Significant performance improvements on cooperative benchmarks

02. Seamless integration with existing CTDE algorithms

03. Matches state-of-the-art results on RWARE

Abstract

The Centralized Training with Decentralized Execution (CTDE) paradigm has gained significant attention in multi-agent reinforcement learning (MARL) and is the foundation of many recent algorithms. However, decentralized policies operate under partial observability and often yield suboptimal performance compared to centralized policies, while fully centralized approaches typically face scalability challenges as the number of agents increases. We propose Centralized Permutation Equivariant (CPE) learning, a centralized training and execution framework that employs a fully centralized policy to overcome these limitations. Our approach leverages a novel permutation equivariant architecture, Global-Local Permutation Equivariant (GLPE) networks, that is lightweight, scalable, and easy to implement. Experiments show that CPE integrates seamlessly with both value decomposition and…
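The abstract names permutation equivariance as the key architectural property but does not spell out what it means. As a rough illustration only, the sketch below shows a generic permutation-equivariant layer in the DeepSets style: each agent's output combines a shared per-agent (local) transform with a transform of an order-independent pooled (global) summary. This is not the paper's GLPE network, whose details are not given here; the names `equivariant_layer`, `W_local`, and `W_global` are hypothetical and chosen for this example.

```python
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def equivariant_layer(obs, W_local, W_global):
    """Generic permutation-equivariant layer (assumed form, not GLPE).

    out_i = W_local @ obs_i + W_global @ mean_j(obs_j)

    The mean does not depend on agent order, and the same weights are
    applied to every agent, so reordering the rows of `obs` reorders
    the rows of the output in the same way.
    """
    n = len(obs)
    dim = len(obs[0])
    pooled = [sum(o[k] for o in obs) / n for k in range(dim)]
    global_term = matvec(W_global, pooled)
    return [
        [l + g for l, g in zip(matvec(W_local, o), global_term)]
        for o in obs
    ]


# Small random instance: 4 agents, 3 input features, 2 output features.
rng = random.Random(0)
n_agents, in_dim, out_dim = 4, 3, 2
W_local = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]
W_global = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]
obs = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(n_agents)]

# Reorder the agents and run the layer on both orderings.
perm = [2, 0, 3, 1]
out = equivariant_layer(obs, W_local, W_global)
out_perm = equivariant_layer([obs[p] for p in perm], W_local, W_global)

# Equivariance check: row i of the permuted output matches row perm[i]
# of the original output, up to floating-point rounding.
is_equivariant = all(
    abs(a - b) < 1e-9
    for i, p in enumerate(perm)
    for a, b in zip(out_perm[i], out[p])
)
```

Because the weights are shared across agents and the pooled term has a fixed size, the parameter count of such a layer does not grow with the number of agents, which is one common reason equivariant designs are considered for scaling centralized policies.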

Taxonomy

Topics: Reinforcement Learning in Robotics