Local Advantage Actor-Critic for Robust Multi-Agent Deep Reinforcement Learning

Yuchen Xiao; Xueguang Lyu; Christopher Amato

arXiv:2110.08642·cs.LG·December 21, 2021·1 cites

Open Access

TL;DR

The paper introduces ROLA, a robust multi-agent policy gradient method that uses local critics and centralized training to reduce variance and improve credit assignment, demonstrating superior performance across benchmarks.

Contribution

It proposes ROLA, a novel multi-agent policy gradient algorithm with local critics and centralized training, enhancing robustness and efficiency in multi-agent reinforcement learning.

Findings

01 ROLA outperforms state-of-the-art algorithms on various benchmarks.

02 The method effectively reduces variance in policy gradient estimates.

03 ROLA demonstrates robustness to environmental stochasticity and non-stationarity.

Abstract

Policy gradient methods have become popular in multi-agent reinforcement learning, but they suffer from high variance due to environmental stochasticity and exploring agents (i.e., non-stationarity), which is potentially worsened by the difficulty of credit assignment. As a result, there is a need for a method that not only solves these two problems efficiently but is also robust enough to solve a variety of tasks. To this end, we propose a new multi-agent policy gradient method, called Robust Local Advantage (ROLA) Actor-Critic. ROLA allows each agent to learn an individual action-value function as a local critic while ameliorating environment non-stationarity via a novel centralized training approach based on a centralized critic. By using this local critic, each agent calculates a baseline to reduce variance on its policy gradient estimation,…
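
The abstract excerpt says each agent uses its local critic to compute a variance-reducing baseline but is cut off before giving the exact form. As a rough illustration only, the sketch below uses a common choice from the actor-critic literature: the policy-weighted average of the agent's local action values as the baseline, subtracted from the value of the chosen action. The function name `local_advantage`, the toy numbers, and the baseline form itself are assumptions for illustration, not details confirmed by the text above.

```python
from typing import Sequence


def local_advantage(q_values: Sequence[float],
                    policy: Sequence[float],
                    action: int) -> float:
    """Advantage of `action` under one agent's local critic.

    Assumed form (not taken from the excerpt above):
        A(a) = Q(a) - sum_a' pi(a') * Q(a')
    where Q holds the agent's local action values and pi its policy.
    """
    if len(q_values) != len(policy):
        raise ValueError("q_values and policy must have the same length")
    # Baseline: expected local action value under the agent's own policy.
    baseline = sum(p * q for p, q in zip(policy, q_values))
    return q_values[action] - baseline


# Toy example: one agent with three actions (hypothetical values).
q = [1.0, 2.0, 4.0]
pi = [0.5, 0.25, 0.25]

advantages = [local_advantage(q, pi, a) for a in range(len(q))]

# The policy-weighted mean of the advantages is zero, which is why
# subtracting this baseline leaves the gradient estimate unbiased.
expected_advantage = sum(p * adv for p, adv in zip(pi, advantages))
```

Because the baseline depends only on the agent's own history and policy, not on the sampled action, it changes the variance of the policy gradient estimate without changing its expectation.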

Taxonomy

Topics: Reinforcement Learning in Robotics