DeCOM: Decomposed Policy for Constrained Cooperative Multi-Agent Reinforcement Learning
Zhaoxing Yang, Rong Ding, Haiming Jin, Yifei Wei, Haoyi You, Guiyun Fan, Xiaoying Gan, Xinbing Wang

TL;DR
DeCOM introduces a decomposed policy framework for constrained cooperative multi-agent reinforcement learning, enabling scalable, efficient optimization under constraints with theoretical guarantees and validation in large-scale environments.
Contribution
DeCOM's modular policy decomposition and iterative optimization approach address constraints in cooperative MARL, offering scalability and convergence guarantees.
Findings
Effective in toy environments with various costs
Scalable to large environments with 500 agents
Convergence is supported by theoretical analysis
Abstract
In recent years, multi-agent reinforcement learning (MARL) has presented impressive performance in various applications. However, physical limitations, budget restrictions, and many other factors usually impose constraints on a multi-agent system (MAS), which cannot be handled by traditional MARL frameworks. Specifically, this paper focuses on constrained MASes where agents work cooperatively to maximize the expected team-average return under various constraints on expected team-average costs, and develops a constrained cooperative MARL framework, named DeCOM, for such MASes. In particular, DeCOM decomposes the policy of each agent into two modules, which empowers information sharing among agents to achieve better cooperation. In addition, with such modularization, the training algorithm of DeCOM separates the original constrained optimization into an…
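The problem setting named in the abstract (maximizing the expected team-average return subject to constraints on expected team-average costs) can be sketched as a constrained optimization. The notation here is assumed for illustration and is not taken from the paper: $N$ agents, joint policy $\pi$, per-agent reward $r^i_t$, per-agent costs $c^i_{j,t}$ for $j = 1, \dots, M$, cost bounds $D_j$, and a discount factor $\gamma$. The discounted form and the upper-bound direction of the constraints are also assumptions.

```latex
% Sketch under assumed notation; symbols are illustrative, not the paper's.
\begin{aligned}
\max_{\pi} \quad & J_R(\pi) = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} \, \frac{1}{N} \sum_{i=1}^{N} r^{i}_{t} \right] \\
\text{s.t.} \quad & J_{C_j}(\pi) = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} \, \frac{1}{N} \sum_{i=1}^{N} c^{i}_{j,t} \right] \le D_j,
\qquad j = 1, \dots, M.
\end{aligned}
```

Both the objective and each constraint average over the team before taking the expectation, which is what distinguishes this setting from one with per-agent constraints.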
Taxonomy
Topics: Reinforcement Learning in Robotics · Auction Theory and Applications · Supply Chain and Inventory Management
