NVIF: Neighboring Variational Information Flow for Large-Scale Cooperative Multi-Agent Scenarios

Jiajun Chai; Yuanheng Zhu; Dongbin Zhao

arXiv:2207.00964·cs.MA·July 5, 2022·1 cites

Open Access

TL;DR

This paper introduces NVIF, a communication protocol using variational auto-encoders for large-scale multi-agent reinforcement learning, enabling scalable cooperation and outperforming existing methods.

Contribution

The paper proposes NVIF, a task-independent, pre-trainable communication method using variational auto-encoders, combined with PPO and DQN for improved large-scale multi-agent cooperation.

Findings

01. NVIF outperforms existing methods in large-scale environments.

02. NVIF enables scalable and effective cooperation strategies.

03. Pre-training stabilizes MARL training with NVIF.

Abstract

Communication-based multi-agent reinforcement learning (MARL) lets agents exchange information, which promotes cooperation. However, existing methods do not perform well in large-scale multi-agent systems. In this paper, we adopt neighboring communication and propose Neighboring Variational Information Flow (NVIF) to provide efficient communication for agents. It employs a variational auto-encoder to compress the shared information into a latent state. This communication protocol does not depend on a specific task, so it can be pre-trained to stabilize MARL training. In addition, we combine NVIF with Proximal Policy Optimization (NVIF-PPO) and Deep Q Network (NVIF-DQN), and present a theoretical analysis showing that NVIF-PPO can promote cooperation. We evaluate NVIF-PPO and NVIF-DQN on MAgent, a widely used large-scale multi-agent environment, by…
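The abstract's core idea, compressing information shared among neighboring agents into a latent state with a variational auto-encoder, can be illustrated with a small sketch. This is not the paper's architecture: the linear encoder and decoder, the mean aggregation over a fixed radius, and every name below (`neighbor_mean`, `TinyVAE`, `radius`, the dimensions) are assumptions made for illustration. It only runs a forward pass and computes the standard VAE loss (reconstruction error plus KL divergence to a unit Gaussian); no training is performed.

```python
import numpy as np

def neighbor_mean(positions, features, radius):
    """Average the features of all agents within `radius` of each agent (self included)."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)               # (n, n) pairwise distances
    mask = (dist <= radius).astype(features.dtype)     # diagonal is always 1
    return mask @ features / mask.sum(axis=1, keepdims=True)

class TinyVAE:
    """Hypothetical linear VAE; the weights are random and untrained."""

    def __init__(self, in_dim, latent_dim, rng):
        self.rng = rng
        self.w_mu = rng.normal(0.0, 0.1, (in_dim, latent_dim))
        self.w_logvar = rng.normal(0.0, 0.1, (in_dim, latent_dim))
        self.w_dec = rng.normal(0.0, 0.1, (latent_dim, in_dim))

    def encode(self, x):
        # Mean and log-variance of the approximate posterior q(z | x).
        return x @ self.w_mu, x @ self.w_logvar

    def sample(self, mu, logvar):
        # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I).
        eps = self.rng.standard_normal(mu.shape)
        return mu + np.exp(0.5 * logvar) * eps

    def loss(self, x):
        mu, logvar = self.encode(x)
        z = self.sample(mu, logvar)
        recon = z @ self.w_dec
        recon_err = np.mean(np.sum((recon - x) ** 2, axis=1))
        # KL(q(z | x) || N(0, I)), averaged over agents.
        kl = np.mean(-0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar), axis=1))
        return recon_err + kl, z

rng = np.random.default_rng(0)
n_agents, obs_dim, latent_dim = 6, 8, 3                # illustrative sizes
positions = rng.uniform(0.0, 10.0, (n_agents, 2))
observations = rng.normal(size=(n_agents, obs_dim))

# Each agent's input: its own observation plus the mean over its neighborhood.
shared = np.concatenate(
    [observations, neighbor_mean(positions, observations, radius=4.0)], axis=1
)
vae = TinyVAE(shared.shape[1], latent_dim, rng)
total_loss, latent = vae.loss(shared)
print(latent.shape, float(total_loss))
```

Because the loss depends only on the shared information and not on any reward signal, a module of this kind could in principle be fitted before reinforcement learning begins, which is the property the abstract relies on for pre-training.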

Taxonomy

Topics: Reinforcement Learning in Robotics