CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning

Jiachen Yang; Alireza Nakhaei; David Isele; Kikuo Fujimura; Hongyuan Zha

arXiv:1809.05188·cs.LG·January 28, 2020·23 cites

Open Access · 1 Repo

TL;DR

CM3 introduces a novel two-stage curriculum and credit assignment mechanism to improve cooperative multi-goal multi-agent reinforcement learning, enabling faster learning in complex multi-agent tasks.

Contribution

The paper proposes a new multi-goal multi-agent policy gradient with a credit function and a curriculum-based training scheme, addressing exploration and credit assignment challenges.

Findings

01 CM3 learns faster than existing algorithms.

02 CM3 is effective in complex multi-goal multi-agent environments.

03 CM3 succeeds in navigation, traffic, and game scenarios.

Abstract

A variety of cooperative multi-agent control problems require agents to achieve individual goals while contributing to collective success. This multi-goal multi-agent setting poses difficulties for recent algorithms, which primarily target settings with a single global reward, due to two new challenges: efficient exploration for learning both individual goal attainment and cooperation for others' success, and credit-assignment for interactions between actions and goals of different agents. To address both challenges, we restructure the problem into a novel two-stage curriculum, in which single-agent goal attainment is learned prior to learning multi-agent cooperation, and we derive a new multi-goal multi-agent policy gradient with a credit function for localized credit assignment. We use a function augmentation scheme to bridge value and policy functions across the curriculum. The…
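
The function augmentation idea mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the class `AugmentedScorer`, its linear form, and the choice to zero-initialize the new weights are all assumptions made for illustration. The sketch shows one way a function trained in a single-agent stage could be extended with an input for other agents' observations while producing the same outputs at the start of the multi-agent stage.

```python
import random


def linear(weights, inputs):
    """Multiply a weight matrix (list of rows) by an input vector."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


class AugmentedScorer:
    """Hypothetical linear action scorer that can gain a second input stream."""

    def __init__(self, n_actions, self_dim, seed=0):
        rng = random.Random(seed)
        self.n_actions = n_actions
        # Weights on the agent's own observation, used in both stages.
        self.w_self = [
            [rng.uniform(-1.0, 1.0) for _ in range(self_dim)]
            for _ in range(n_actions)
        ]
        # Weights on other agents' observations do not exist in stage 1.
        self.w_others = None

    def augment(self, others_dim):
        # Assumption: new weights start at zero, so the augmented function
        # initially matches the stage-1 function exactly.
        self.w_others = [[0.0] * others_dim for _ in range(self.n_actions)]

    def scores(self, self_obs, others_obs=None):
        out = linear(self.w_self, self_obs)
        if self.w_others is not None and others_obs is not None:
            extra = linear(self.w_others, others_obs)
            out = [a + b for a, b in zip(out, extra)]
        return out


# Stage 1: the scorer sees only the agent's own observation.
scorer = AugmentedScorer(n_actions=3, self_dim=4)
self_obs = [0.5, -1.0, 0.25, 2.0]
stage1_scores = scorer.scores(self_obs)

# Stage 2: add an input for other agents' observations.
scorer.augment(others_dim=2)
others_obs = [1.0, -3.0]
stage2_scores = scorer.scores(self_obs, others_obs)
```

Under the zero-initialization assumption, `stage2_scores` equals `stage1_scores` until the new weights are trained, so behavior learned in the first stage carries over as the starting point for the second.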

Code & Models

Repositories

011235813/cm3 (TensorFlow, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Distributed Control Multi-Agent Systems