MA-Dreamer: Coordination and communication through shared imagination

Kenzo Lobos-Tsunekawa; Akshay Srinivasan; Michael Spranger

arXiv:2204.04687 · cs.LG · April 12, 2022 · 1 citation

Open Access

TL;DR

MA-Dreamer is a model-based multi-agent reinforcement learning method that enables decentralized agents to coordinate and communicate effectively through shared imagination, outperforming existing approaches in complex, partially observable environments.

Contribution

It introduces a novel model-based approach using shared environment models for training decentralized policies and critics via imagination, improving coordination and communication in multi-agent settings.

Findings

01

MA-Dreamer outperforms other methods in soccer-based cooperative tasks.

02

It effectively handles long-term speaker-listener and partial-observability challenges.

03

The method demonstrates robust coordination and communication capabilities.

Abstract

Multi-agent RL is difficult because the environment appears non-stationary to each individual agent. Theoretically sound methods using the REINFORCE estimator are impeded by its high variance, whereas value-function-based methods suffer from their ad hoc handling of situations such as inter-agent communication. Methods like MADDPG are further constrained by requirements such as centralized critics. To address these issues, we present MA-Dreamer, a model-based method that uses both agent-centric and global differentiable models of the environment to train decentralized agents' policies and critics using model rollouts, a.k.a. 'imagination'. Since only the model training is done off-policy, inter-agent communication/coordination and 'language emergence' can be handled in a straightforward manner. We compare the performance of…
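
As a rough illustration of what training "in imagination" means, the toy sketch below rolls a shared model forward with two decentralized policies and scores the imagined trajectory. It is not the authors' implementation: `shared_model`, `agent_policy`, `imagine`, the linear dynamics, and the quadratic reward are all hypothetical stand-ins, and the gradient step through the differentiable model is omitted.

```python
def shared_model(state, actions):
    """Hypothetical stand-in for a learned global model (assumed linear dynamics)."""
    next_state = [0.9 * s + 0.1 * a for s, a in zip(state, actions)]
    reward = -sum(s * s for s in next_state)  # assumed shared cooperative reward
    return next_state, reward


def agent_policy(weight, observation):
    """Decentralized policy: an agent acts only on its own observation."""
    return weight * observation


def imagine(initial_state, weights, horizon=5, gamma=0.95):
    """Roll the model forward without querying the real environment."""
    state, ret, discount = list(initial_state), 0.0, 1.0
    for _ in range(horizon):
        # Agent i sees only component i of the state (partial observability).
        actions = [agent_policy(w, s) for w, s in zip(weights, state)]
        state, reward = shared_model(state, actions)
        ret += discount * reward
        discount *= gamma
    return ret


start = [1.0, -2.0]
passive = imagine(start, weights=[0.0, 0.0])        # agents do nothing
corrective = imagine(start, weights=[-1.0, -1.0])   # agents push their state toward zero
print(passive, corrective)
```

In this toy setting the corrective policies earn a higher imagined return than the passive ones, so candidate policies can be compared using model rollouts alone. A method of the kind the abstract describes would instead backpropagate through the model to update policies and critics.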

Taxonomy

Topics: Multi-Agent Systems and Negotiation · Opinion Dynamics and Social Influence

Methods: Experience Replay · Adam · Dense Connections · Weight Decay · Batch Normalization · Convolution · REINFORCE · MADDPG