Local Advantage Networks for Cooperative Multi-Agent Reinforcement Learning

Raphaël Avalos; Mathieu Reymond; Ann Nowé; Diederik M. Roijers

arXiv:2112.12458 · cs.LG · October 27, 2023



Open Access

TL;DR

This paper introduces Local Advantage Networks (LAN), a cooperative multi-agent reinforcement learning algorithm that combines per-agent advantage functions in a dueling architecture with a centralized critic to improve scalability and performance.

Contribution

LAN learns decentralized policies with the help of a centralized critic rather than a factorized value function, and demonstrates state-of-the-art results on the StarCraft II multi-agent challenge benchmark.

Findings

01. LAN achieves state-of-the-art performance on the StarCraft II multi-agent challenge benchmark.

02. LAN is highly scalable with respect to the number of agents.

03. The centralized critic stabilizes learning by reducing the moving target problem of the individual advantages.

Abstract

Many recent successful off-policy multi-agent reinforcement learning (MARL) algorithms for cooperative partially observable environments focus on finding factorized value functions, leading to convoluted network structures. Building on the structure of independent Q-learners, our LAN algorithm takes a radically different approach, leveraging a dueling architecture to learn, for each agent, a decentralized best-response policy via an individual advantage function. Learning is stabilized by a centralized critic whose primary objective is to reduce the moving target problem of the individual advantages. The critic, whose network size is independent of the number of agents, is cast aside after learning. Evaluation on the StarCraft II multi-agent challenge benchmark shows that LAN reaches state-of-the-art performance and is highly scalable with respect to the number of agents, opening up…
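
The dueling combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the plain-float inputs standing in for network outputs, and the convention of shifting advantages so that the best action has advantage zero are all assumptions made for illustration. The sketch shows why a critic can be dropped after learning: the greedy action depends only on the agent's own advantages.

```python
from typing import List


def normalize_advantages(raw_adv: List[float]) -> List[float]:
    """Shift advantages so the best action has advantage 0.

    Assumed dueling convention; the abstract does not specify one.
    """
    best = max(raw_adv)
    return [a - best for a in raw_adv]


def local_q_values(central_value: float, raw_adv: List[float]) -> List[float]:
    """Combine a centralized value estimate with one agent's advantages.

    `central_value` is a hypothetical stand-in for the critic's output;
    `raw_adv` stands in for the agent's advantage network output.
    """
    return [central_value + a for a in normalize_advantages(raw_adv)]


def greedy_action(raw_adv: List[float]) -> int:
    """Pick the action with the largest advantage.

    Only the local advantages are needed, so no critic is required here.
    """
    return max(range(len(raw_adv)), key=lambda i: raw_adv[i])


if __name__ == "__main__":
    adv = [0.2, 1.5, -0.3]          # hypothetical advantages for 3 actions
    q = local_q_values(10.0, adv)   # hypothetical critic value of 10.0
    print(q, greedy_action(adv))
```

Because adding the same critic value to every action does not change which action is largest, the action chosen from the combined values matches the one chosen from the advantages alone.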


Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control

Methods: Convolution · Q-Learning · Dense Connections · Deep Q-Network