MA-Dreamer: Coordination and communication through shared imagination
Kenzo Lobos-Tsunekawa, Akshay Srinivasan, Michael Spranger

TL;DR
MA-Dreamer is a model-based multi-agent reinforcement learning method that enables decentralized agents to coordinate and communicate effectively through shared imagination, outperforming existing approaches in complex, partially observable environments.
Contribution
It introduces a novel model-based approach using shared environment models for training decentralized policies and critics via imagination, improving coordination and communication in multi-agent settings.
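The idea of training decentralized policies inside a shared model can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: `shared_model`, `make_policy`, `imagine`, and `improve` are hypothetical names, the one-dimensional dynamics are invented for the example, and finite differences stand in for backpropagation through a differentiable learned model.

```python
def shared_model(state, actions):
    """Hypothetical stand-in for a learned global model (toy 1-D dynamics)."""
    next_state = state + sum(actions)
    reward = -abs(next_state)  # shared team reward: drive the state to 0
    return next_state, reward


def make_policy(gain):
    """Decentralized policy: each agent maps its own observation to an action."""
    def policy(obs):
        return -gain * obs
    return policy


def imagine(state, policies, horizon, gamma=0.99):
    """Roll all agents forward inside the shared model; return the discounted return."""
    total, discount = 0.0, 1.0
    for _ in range(horizon):
        actions = [pi(state) for pi in policies]
        state, reward = shared_model(state, actions)
        total += discount * reward
        discount *= gamma
    return total


def improve(gains, state, horizon, lr=0.1, eps=1e-4):
    """One ascent step per agent on the imagined return.

    Finite differences are an assumption of this sketch; a differentiable
    model would allow analytic gradients instead.
    """
    base = imagine(state, [make_policy(g) for g in gains], horizon)
    new_gains = []
    for i, g in enumerate(gains):
        bumped = list(gains)
        bumped[i] = g + eps
        bumped_return = imagine(state, [make_policy(b) for b in bumped], horizon)
        new_gains.append(g + lr * (bumped_return - base) / eps)
    return new_gains


if __name__ == "__main__":
    gains = [0.0, 0.0]
    before = imagine(1.0, [make_policy(g) for g in gains], horizon=3)
    gains = improve(gains, state=1.0, horizon=3)
    after = imagine(1.0, [make_policy(g) for g in gains], horizon=3)
    print(before, after)
```

Both agents are updated purely from imagined rollouts, so no additional environment interaction is needed for the policy step in this sketch. Each agent keeps its own parameters, and only the model is shared.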
Findings
MA-Dreamer outperforms other methods in soccer-based cooperative tasks.
It effectively handles long-term speaker-listener and partial-observability challenges.
The method demonstrates robust coordination and communication capabilities.
Abstract
Multi-agent RL is difficult because the environment perceived by each individual agent is non-stationary. Theoretically sound methods using the REINFORCE estimator are impeded by its high variance, whereas value-function-based methods are affected by issues stemming from their ad hoc handling of situations like inter-agent communication. Methods like MADDPG are further constrained by requirements such as centralized critics. To address these issues, we present MA-Dreamer, a model-based method that uses both agent-centric and global differentiable models of the environment to train decentralized agents' policies and critics using model rollouts, a.k.a. 'imagination'. Since only the model training is done off-policy, inter-agent communication/coordination and 'language emergence' can be handled in a straightforward manner. We compare the performance of…
Taxonomy
Topics: Multi-Agent Systems and Negotiation · Opinion Dynamics and Social Influence
Methods: Experience Replay · Adam · Dense Connections · Weight Decay · Batch Normalization · Convolution · REINFORCE · MADDPG
