A Deeper Understanding of State-Based Critics in Multi-Agent Reinforcement Learning

Xueguang Lyu; Andrea Baisero; Yuchen Xiao; Christopher Amato

arXiv:2201.01221 · cs.LG · May 26, 2022 · 1 citation

TL;DR

This paper investigates the theoretical properties of state-based critics in multi-agent reinforcement learning, revealing potential biases and variance issues, and empirically evaluates their impact across various benchmarks.

Contribution

The paper provides the first theoretical analysis of state-based critics, identifying the bias and variance they can introduce into policy gradient estimates, and empirically assesses their practical effects in multi-agent settings.

Findings

1. State-based critics can introduce bias in policy gradient estimates.

2. Using state-based critics can increase gradient variance.

3. Environmental properties influence the effectiveness of different critic types.
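
The first two findings can be written compactly. The notation here is an assumption made for illustration, since this summary defines no symbols: h is a joint action-observation history, s the true system state, a the joint action, π_θ the history-conditioned policy, Q(h, a) a history-based critic, and Q(s, a) a state-based critic. This is a sketch of the standard argument, not a reproduction of the paper's theorems.

```latex
% Single-sample gradient estimates with each critic type (assumed notation)
\hat g_h = Q(h, a)\, \nabla_\theta \log \pi_\theta(a \mid h), \qquad
\hat g_s = Q(s, a)\, \nabla_\theta \log \pi_\theta(a \mid h)

% A sufficient condition for the state-based estimate to be unbiased
\mathbb{E}_{s \mid h, a}\big[ Q(s, a) \big] = Q(h, a)
\;\Longrightarrow\;
\mathbb{E}[\hat g_s] = \mathbb{E}[\hat g_h]

% Even when that condition holds, the law of total variance gives,
% for each gradient component,
\operatorname{Var}(\hat g_s)
= \operatorname{Var}(\hat g_h)
+ \mathbb{E}_{h, a}\big[ \operatorname{Var}_{s \mid h, a}(\hat g_s) \big]
\;\ge\; \operatorname{Var}(\hat g_h)
```

When the condition in the second line fails, the state-based estimate can be biased. When it holds, the extra variance term comes from the state uncertainty that remains after conditioning on the history.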

Abstract

Centralized Training for Decentralized Execution, where training is done in a centralized offline fashion, has become a popular solution paradigm in Multi-Agent Reinforcement Learning. Many such methods take the form of actor-critic with state-based critics, since centralized training allows access to the true system state, which can be useful during training despite not being available at execution time. State-based critics have become a common empirical choice, albeit one which has had limited theoretical justification or analysis. In this paper, we show that state-based critics can introduce bias in the policy gradient estimates, potentially undermining the asymptotic guarantees of the algorithm. We also show that, even if the state-based critics do not introduce any bias, they can still result in a larger gradient variance, contrary to the common intuition. Finally, we show the…
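
The variance claim can be checked on a toy problem. The sketch below is not taken from the paper: the one-step task, the table `Q_STATE`, the uniform hidden-state prior `P_STATE`, and the sigmoid policy are all hypothetical choices. The history is uninformative, so the history-based critic is the average of the state-based critic over hidden states, which makes the state-based estimate unbiased by construction. Moments are computed by exact enumeration rather than sampling.

```python
import math

# Hypothetical one-step problem: hidden state s in {0, 1}, a single
# uninformative history, and two actions a in {0, 1}.
P_STATE = [0.5, 0.5]          # assumed prior over the hidden state
Q_STATE = [[1.0, 0.0],        # Q(s=0, a=0), Q(s=0, a=1)
           [0.0, 3.0]]        # Q(s=1, a=0), Q(s=1, a=1)


def policy(theta):
    """Return [pi(a=0), pi(a=1)] with pi(a=1) = sigmoid(theta)."""
    p1 = 1.0 / (1.0 + math.exp(-theta))
    return [1.0 - p1, p1]


def score(theta, a):
    """d/dtheta log pi(a) for the sigmoid policy above."""
    p1 = policy(theta)[1]
    return (1.0 - p1) if a == 1 else -p1


def q_history(a):
    """History-based critic: state-based critic averaged over hidden states."""
    return sum(P_STATE[s] * Q_STATE[s][a] for s in range(2))


def gradient_moments(theta, use_state_critic):
    """Exact mean and variance of the single-sample gradient estimate."""
    pi = policy(theta)
    mean, second = 0.0, 0.0
    for s in range(2):
        for a in range(2):
            prob = P_STATE[s] * pi[a]   # s and a are independent here
            q = Q_STATE[s][a] if use_state_critic else q_history(a)
            g = q * score(theta, a)
            mean += prob * g
            second += prob * g * g
    return mean, second - mean ** 2


if __name__ == "__main__":
    for label, flag in [("history-based", False), ("state-based", True)]:
        mean, var = gradient_moments(0.0, flag)
        print(f"{label}: mean={mean:.4f} variance={var:.4f}")
```

Both critics yield the same expected gradient in this setup, while the state-based one has the larger variance, because each sample also carries the randomness of the hidden state. The example does not cover the biased case, which requires the averaging condition to fail.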

Taxonomy

Topics: Advanced Memory and Neural Computing · Reinforcement Learning in Robotics