Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning

Jakob N. Foerster; Francis Song; Edward Hughes; Neil Burch; Iain Dunning; Shimon Whiteson; Matthew Botvinick; Michael Bowling

arXiv:1811.01458 · cs.MA · September 12, 2019 · 49 cites

TL;DR

The paper introduces Bayesian Action Decoder (BAD), a novel multi-agent reinforcement learning method that uses Bayesian updates to improve cooperation and strategy discovery in complex, partially observable environments.

Contribution

BAD is a new approach that incorporates Bayesian reasoning into multi-agent RL, enabling agents to better infer and communicate private information.

Findings

01. Outperforms policy gradient methods in a two-step matrix game.

02. Achieves state-of-the-art results in two-player Hanabi, surpassing previously published learning and hand-coded approaches.

03. Demonstrates effective learning of strategies in complex, partially observable settings.

Abstract

When observing the actions of others, humans make inferences about why they acted as they did, and what this implies about the world; humans also use the fact that their actions will be interpreted in this manner, allowing them to act informatively and thereby communicate efficiently with others. Although learning algorithms have recently achieved superhuman performance in a number of two-player, zero-sum games, scalable multi-agent reinforcement learning algorithms that can discover effective strategies and conventions in complex, partially observable settings have proven elusive. We present the Bayesian action decoder (BAD), a new multi-agent learning method that uses an approximate Bayesian update to obtain a public belief that conditions on the actions taken by all agents in the environment. BAD introduces a new Markov decision process, the public belief MDP, in which the action…
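The public belief update mentioned in the abstract can be illustrated with a minimal sketch. It assumes a small discrete private feature and a publicly known policy that maps each private feature to an action distribution. The function name `public_belief_update` and the toy numbers are hypothetical, and the update here is exact on a tiny space, whereas the paper describes an approximate update.

```python
import numpy as np


def public_belief_update(belief, policy, action):
    """Bayes update of a public belief over a private feature.

    belief: shape (n_private,), prior P(f) shared by all agents.
    policy: shape (n_private, n_actions), P(action | f); for a
            deterministic policy each row is one-hot.
    action: index of the publicly observed action.
    """
    likelihood = policy[:, action]        # P(action | f) for every f
    unnormalized = likelihood * belief    # P(action | f) * P(f)
    total = unnormalized.sum()            # P(action)
    if total == 0.0:
        raise ValueError("observed action has zero probability under the policy")
    return unnormalized / total           # posterior P(f | action)


# Toy example (assumed values): three possible private features, two actions.
prior = np.array([0.25, 0.25, 0.5])
policy = np.array([
    [1.0, 0.0],  # feature 0 always plays action 0
    [0.0, 1.0],  # feature 1 always plays action 1
    [1.0, 0.0],  # feature 2 always plays action 0
])
posterior = public_belief_update(prior, policy, action=0)
print(posterior)
```

Observing action 0 rules out feature 1 and renormalizes the remaining mass, which is the sense in which an action taken under a known policy communicates private information to the other agents.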

Code & Models

Repositories

facebookresearch/jps (PyTorch)

Taxonomy

Topics: Reinforcement Learning in Robotics · Anomaly Detection Techniques and Applications · Artificial Immune Systems Applications