Inducing Cooperation via Team Regret Minimization based Multi-Agent Deep Reinforcement Learning

Runsheng Yu; Zhenyu Shi; Xinrun Wang; Rundong Wang; Buhong Liu; Xinwen Hou; Hanjiang Lai; Bo An

arXiv:1911.07712 · cs.AI · November 19, 2019 · 1 cites

Open Access

TL;DR

This paper introduces a novel team regret minimization approach for multi-agent deep reinforcement learning, improving cooperation in partially observable environments through decentralized policies and state estimation.

Contribution

It proposes a new team regret minimization method, a way to decompose team regret for decentralized execution, and employs a differential particle filter for better state estimation.

Findings

01 Outperforms state-of-the-art methods in cooperative and mixed games.

02 Effective in partially observable environments.

03 Enhances sample efficiency and cooperation among agents.

Abstract

Existing value-factorized based Multi-Agent deep Reinforcement Learning (MARL) approaches perform well in various multi-agent cooperative environments under the centralized training and decentralized execution (CTDE) scheme, where all agents are trained together by the centralized value network and each agent executes its policy independently. However, an issue remains open: in the centralized training process, when the environment for the team is partially observable or non-stationary, i.e., the observation and action information of all the agents cannot represent the global states, existing methods perform poorly and are sample-inefficient. Regret Minimization (RM) can be a promising approach as it performs well in partially observable and fully competitive settings. However, it tends to model others as opponents and thus cannot work well under the CTDE scheme. In this work, we propose a…
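
For readers unfamiliar with regret minimization, the following is a minimal sketch of generic tabular regret matching for a single agent facing fixed action values. It is not the paper's team regret method, which the truncated abstract does not spell out; the function names and the toy payoff vector are assumptions made purely for illustration.

```python
import numpy as np


def regret_matching_policy(cum_regret):
    """Play each action in proportion to its positive cumulative regret."""
    positive = np.maximum(cum_regret, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    # No action has positive regret yet: fall back to a uniform policy.
    return np.full(len(cum_regret), 1.0 / len(cum_regret))


def update_regret(cum_regret, action_values, policy):
    """Accumulate instantaneous regret: action value minus the policy's expected value."""
    expected = float(np.dot(policy, action_values))
    return cum_regret + (action_values - expected)


# Toy setting (hypothetical): three actions with fixed payoffs.
action_values = np.array([1.0, 0.0, -1.0])
cum_regret = np.zeros(3)

for _ in range(100):
    policy = regret_matching_policy(cum_regret)
    cum_regret = update_regret(cum_regret, action_values, policy)

final_policy = regret_matching_policy(cum_regret)
```

In this toy setting the policy concentrates on the highest-payoff action, because only that action ever accumulates positive regret. In a multi-agent game the action values would depend on what the other agents do, which is where the modeling of others as opponents or as teammates, as discussed in the abstract, becomes relevant.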

Taxonomy

Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games · Sports Analytics and Performance