On Centralized Critics in Multi-Agent Reinforcement Learning
Xueguang Lyu, Andrea Baisero, Yuchen Xiao, Brett Daley, Christopher Amato

TL;DR
This paper provides a theoretical and empirical analysis of centralized critics in multi-agent reinforcement learning, revealing potential drawbacks of state-based critics and their impact on learning under partial observability.
Contribution
It offers a formal analysis showing that centralized critics are not always beneficial and that state-based critics can introduce bias and variance, challenging common assumptions.
Findings
Centralized critics are not always advantageous in MARL.
State-based critics can cause bias and variance issues.
Practical experiments highlight challenges with partial observability.
Abstract
Centralized Training for Decentralized Execution, where agents are trained offline in a centralized fashion and execute online in a decentralized manner, has become a popular approach in Multi-Agent Reinforcement Learning (MARL). In particular, it has become popular to develop actor-critic methods that train decentralized actors with a centralized critic, where the centralized critic is allowed access to global information of the entire system, including the true system state. Such centralized critics are possible given offline information and are not used for online execution. While these methods perform well in a number of domains and have become a de facto standard in MARL, using a centralized critic in this context has yet to be sufficiently analyzed theoretically or empirically. In this paper, we therefore formally analyze centralized and decentralized critic approaches, and analyze the…
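The split the abstract describes, decentralized actors trained with a centralized critic, can be illustrated with a minimal tabular sketch. This is not the paper's implementation: the toy sizes, the fixed random critic table, and all names (`actor_logits`, `central_q`, `act`, `actor_update`) are assumptions made for illustration. The point is only which inputs each component sees: each actor conditions on its own observation, while the critic is indexed by the true state and the joint action and is used during training only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes, chosen only for illustration.
N_AGENTS = 2
N_STATES = 3    # true system states (hidden from the actors)
N_OBS = 2       # local observations per agent
N_ACTIONS = 2   # actions per agent

# Decentralized actors: one softmax policy table per agent,
# indexed by that agent's own observation only.
actor_logits = [np.zeros((N_OBS, N_ACTIONS)) for _ in range(N_AGENTS)]

# Centralized state-based critic Q(s, a_1, a_2). It sees the true state
# and the joint action, and is only consulted during training.
# Here it is a fixed random table rather than a learned one.
central_q = rng.normal(size=(N_STATES, N_ACTIONS, N_ACTIONS))


def policy(agent, obs):
    """Softmax action distribution for one agent given its local observation."""
    logits = actor_logits[agent][obs]
    z = np.exp(logits - logits.max())
    return z / z.sum()


def act(observations):
    """Decentralized execution: each agent uses only its own observation."""
    return [int(rng.choice(N_ACTIONS, p=policy(i, o)))
            for i, o in enumerate(observations)]


def actor_update(state, observations, actions, lr=0.1):
    """One policy-gradient step per agent, weighted by the centralized critic."""
    q = central_q[state, actions[0], actions[1]]
    for i, (o, a) in enumerate(zip(observations, actions)):
        grad_log = -policy(i, o)   # gradient of log-softmax w.r.t. the logits
        grad_log[a] += 1.0
        actor_logits[i][o] += lr * q * grad_log
```

Note that `act` never touches `state` or `central_q`, so the critic can be discarded once training ends. A decentralized critic would instead be one table per agent, indexed by that agent's own observation and action.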
Taxonomy
Topics: Opinion Dynamics and Social Influence
