MAVEN: Multi-Agent Variational Exploration

Anuj Mahajan; Tabish Rashid; Mikayel Samvelyan; Shimon Whiteson

arXiv:1910.07483 · cs.LG · January 22, 2020 · 76 cites

TL;DR

MAVEN is a hierarchical multi-agent reinforcement learning method that hybridises value-based and policy-based approaches through a shared latent space, enabling more effective exploration and improved performance on complex cooperative tasks.

Contribution

The paper proposes MAVEN, a hierarchical approach that improves exploration in cooperative multi-agent reinforcement learning: value-based agents condition their behaviour on a shared latent variable controlled by a hierarchical policy.

Findings

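01 MAVEN outperforms existing methods on the SMAC (StarCraft Multi-Agent Challenge) benchmark.

02 Hierarchical control through a shared latent variable improves exploration in cooperative multi-agent environments.

03 The representational constraints that QMIX and similar value-based methods place on joint action-values lead to provably poor exploration and suboptimality, which MAVEN addresses.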
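The constraint behind the third finding can be illustrated with a small sketch. It assumes a simplified linear mixer with non-negative weights in place of QMIX's state-conditioned mixing network; the function names and table sizes are hypothetical and chosen only for illustration. Under such a monotonic mixer, each agent maximising its own utility already yields the joint greedy action, which is what makes decentralised execution possible and also what restricts the joint action-values the mixer can represent.

```python
import numpy as np

def monotonic_mix(agent_qs, weights, bias=0.0):
    """Q_tot = sum_i |w_i| * Q_i + bias; abs keeps every weight non-negative."""
    return float(np.dot(np.abs(weights), agent_qs) + bias)

def joint_argmax_by_enumeration(q_tables, weights):
    """Brute-force argmax of Q_tot over all joint actions of two agents."""
    q1, q2 = q_tables
    best, best_val = None, -np.inf
    for a1 in range(len(q1)):
        for a2 in range(len(q2)):
            val = monotonic_mix(np.array([q1[a1], q2[a2]]), weights)
            if val > best_val:
                best, best_val = (a1, a2), val
    return best

rng = np.random.default_rng(0)
# Hypothetical per-agent utilities for three actions each.
q_tables = [rng.normal(size=3), rng.normal(size=3)]
weights = rng.normal(size=2)  # signs are discarded by abs() in the mixer

# Each agent picks its own greedy action without seeing the other agent.
decentralised = tuple(int(np.argmax(q)) for q in q_tables)
# A centralised search over all joint actions finds the same answer.
centralised = joint_argmax_by_enumeration(q_tables, weights)
```

Because the two results coincide for any non-negative weights, a payoff whose best joint action is not composed of each agent's individually best action cannot be captured by this family of mixers.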

Abstract

Centralised training with decentralised execution is an important setting for cooperative deep multi-agent reinforcement learning due to communication constraints during execution and computational tractability in training. In this paper, we analyse value-based methods that are known to have superior performance in complex environments [43]. We specifically focus on QMIX [40], the current state-of-the-art in this domain. We show that the representational constraints on the joint action-values introduced by QMIX and similar methods lead to provably poor exploration and suboptimality. Furthermore, we propose a novel approach called MAVEN that hybridises value and policy-based methods by introducing a latent space for hierarchical control. The value-based agents condition their behaviour on the shared latent variable controlled by a hierarchical policy. This allows MAVEN to achieve…
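As a rough illustration of the latent-conditioned control described in the abstract, the following sketch shows a hierarchical policy drawing a shared latent variable and value-based agents conditioning their greedy actions on it. Everything here is an assumption made for illustration rather than the authors' implementation: the latent is taken to be discrete and sampled once per episode, the per-agent Q-functions are plain linear maps, the observations are random placeholders, and all names and sizes are hypothetical.

```python
import numpy as np

N_AGENTS, N_ACTIONS, OBS_DIM, LATENT_DIM = 3, 4, 5, 2  # hypothetical sizes

rng = np.random.default_rng(0)

def sample_latent(logits, rng):
    """Hierarchical policy: draw one discrete latent z, returned as one-hot."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    z = np.zeros(len(logits))
    z[rng.choice(len(logits), p=probs)] = 1.0
    return z

def agent_q_values(obs, z, params):
    """Per-agent utilities conditioned on the local observation and shared z."""
    return params @ np.concatenate([obs, z])  # shape: (N_ACTIONS,)

# One hypothetical linear Q-function per agent (a stand-in for a learned network).
agent_params = [rng.normal(size=(N_ACTIONS, OBS_DIM + LATENT_DIM))
                for _ in range(N_AGENTS)]
latent_logits = np.zeros(LATENT_DIM)  # uniform hierarchical policy

def run_episode(n_steps, rng):
    z = sample_latent(latent_logits, rng)  # assumed: sampled once, then held fixed
    joint_actions = []
    for _ in range(n_steps):
        observations = rng.normal(size=(N_AGENTS, OBS_DIM))  # placeholder environment
        actions = [int(np.argmax(agent_q_values(o, z, p)))
                   for o, p in zip(observations, agent_params)]
        joint_actions.append(actions)
    return z, joint_actions

z, joint_actions = run_episode(n_steps=5, rng=rng)
```

Each agent acts only on its own observation and the shared latent, so execution stays decentralised while a different draw of z can shift all agents' behaviour together.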

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)