CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning
Jiachen Yang, Alireza Nakhaei, David Isele, Kikuo Fujimura, Hongyuan Zha

TL;DR
CM3 introduces a novel two-stage curriculum and credit assignment mechanism to improve cooperative multi-goal multi-agent reinforcement learning, enabling faster learning in complex multi-agent tasks.
Contribution
The paper proposes a new multi-goal multi-agent policy gradient with a credit function and a curriculum-based training scheme, addressing exploration and credit assignment challenges.
Findings
CM3 learns faster than existing algorithms.
CM3 is effective in complex multi-goal multi-agent environments.
CM3 succeeds in navigation, traffic, and game scenarios.
Abstract
A variety of cooperative multi-agent control problems require agents to achieve individual goals while contributing to collective success. This multi-goal multi-agent setting poses difficulties for recent algorithms, which primarily target settings with a single global reward, due to two new challenges: efficient exploration for learning both individual goal attainment and cooperation for others' success, and credit-assignment for interactions between actions and goals of different agents. To address both challenges, we restructure the problem into a novel two-stage curriculum, in which single-agent goal attainment is learned prior to learning multi-agent cooperation, and we derive a new multi-goal multi-agent policy gradient with a credit function for localized credit assignment. We use a function augmentation scheme to bridge value and policy functions across the curriculum. The…
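The function augmentation idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the augmentation is carried out, so everything here is an assumption made for illustration: the linear softmax policy, the class names `Stage1Policy` and `Stage2Policy`, the observation sizes, and in particular the choice to zero-initialize the new weights so that the stage-2 policy starts out reproducing the stage-1 policy. It is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # Numerically stable softmax over a 1-D vector of logits.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

class Stage1Policy:
    """Hypothetical single-agent policy: sees only the agent's own observation."""
    def __init__(self, self_dim, n_actions):
        self.W_self = rng.normal(scale=0.1, size=(n_actions, self_dim))

    def action_probs(self, obs_self):
        return softmax(self.W_self @ obs_self)

class Stage2Policy:
    """Hypothetical augmented policy: also sees observations of other agents.

    Assumption: the stage-1 weights are copied over and the new weights for
    the other-agent input start at zero, so the initial stage-2 behaviour
    equals the stage-1 behaviour.
    """
    def __init__(self, stage1, others_dim):
        self.W_self = stage1.W_self.copy()
        self.W_others = np.zeros((self.W_self.shape[0], others_dim))

    def action_probs(self, obs_self, obs_others):
        return softmax(self.W_self @ obs_self + self.W_others @ obs_others)

# Stage 1: a policy over 3 actions with a 4-dimensional own observation.
stage1 = Stage1Policy(self_dim=4, n_actions=3)

# Stage 2: augment with a 6-dimensional observation of other agents.
stage2 = Stage2Policy(stage1, others_dim=6)

obs_self = rng.normal(size=4)
obs_others = rng.normal(size=6)
p1 = stage1.action_probs(obs_self)
p2 = stage2.action_probs(obs_self, obs_others)
```

Under this assumed initialization, `p1` and `p2` coincide before any stage-2 training, so whatever was learned for single-agent goal attainment is carried into the multi-agent stage, and the new weights are free to learn cooperation from there.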
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Distributed Control Multi-Agent Systems
