NVIF: Neighboring Variational Information Flow for Large-Scale Cooperative Multi-Agent Scenarios
Jiajun Chai, Yuanheng Zhu, Dongbin Zhao

TL;DR
This paper introduces NVIF, a communication protocol using variational auto-encoders for large-scale multi-agent reinforcement learning, enabling scalable cooperation and outperforming existing methods.
Contribution
The paper proposes NVIF, a task-independent, pre-trainable communication method using variational auto-encoders, combined with PPO and DQN for improved large-scale multi-agent cooperation.
Findings
NVIF outperforms existing methods in large-scale environments.
NVIF enables scalable and effective cooperation strategies.
Pre-training stabilizes MARL training with NVIF.
Abstract
Communication-based multi-agent reinforcement learning (MARL) provides information exchange between agents, which promotes cooperation. However, existing methods do not perform well in large-scale multi-agent systems. In this paper, we adopt neighboring communication and propose Neighboring Variational Information Flow (NVIF) to provide efficient communication for agents. It employs a variational auto-encoder to compress the shared information into a latent state. This communication protocol does not depend on a specific task, so it can be pre-trained to stabilize MARL training. In addition, we combine NVIF with Proximal Policy Optimization (NVIF-PPO) and Deep Q Network (NVIF-DQN), and present a theoretical analysis showing that NVIF-PPO can promote cooperation. We evaluate NVIF-PPO and NVIF-DQN on MAgent, a widely used large-scale multi-agent environment, by…
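The abstract's core idea, compressing information shared by neighbors into a latent state with a variational auto-encoder, can be illustrated with a minimal sketch. This is not the authors' architecture: the linear encoder, the mean aggregation over neighbor messages, and every name below (`encode_neighbors`, `kl_to_standard_normal`, `W_mu`, `W_logvar`) are assumptions made only for illustration. It shows the standard VAE encoding step (reparameterization trick) and the KL term that regularizes the latent toward a standard normal prior.

```python
import numpy as np

def encode_neighbors(obs, neighbor_msgs, W_mu, W_logvar, rng):
    """Encode an agent's observation plus aggregated neighbor messages
    into a sampled latent state (hypothetical linear VAE encoder)."""
    msg_dim = W_mu.shape[0] - obs.shape[0]
    # Assumption: neighbor messages are averaged, so the encoder input
    # size does not depend on how many neighbors are in range.
    if len(neighbor_msgs) > 0:
        agg = np.mean(neighbor_msgs, axis=0)
    else:
        agg = np.zeros(msg_dim)
    x = np.concatenate([obs, agg])
    mu = x @ W_mu            # mean of the approximate posterior
    logvar = x @ W_logvar    # log-variance of the approximate posterior
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps
    return z, mu, logvar

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dims."""
    return -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar))

# Example with made-up sizes: 4-dim observation, 3-dim messages, 2-dim latent.
rng = np.random.default_rng(0)
obs_dim, msg_dim, latent_dim = 4, 3, 2
W_mu = 0.1 * rng.standard_normal((obs_dim + msg_dim, latent_dim))
W_logvar = 0.1 * rng.standard_normal((obs_dim + msg_dim, latent_dim))
z, mu, logvar = encode_neighbors(
    np.ones(obs_dim), np.ones((5, msg_dim)), W_mu, W_logvar, rng
)
```

In a full VAE this encoder would be trained together with a decoder under a reconstruction loss plus the KL term; because that objective does not involve task reward, such a module could in principle be pre-trained separately, which is the property the abstract highlights.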
Taxonomy
Topics: Reinforcement Learning in Robotics
