Asynchronous Actor-Critic for Multi-Agent Reinforcement Learning
Yuchen Xiao, Weihao Tan, Christopher Amato

TL;DR
This paper introduces asynchronous multi-agent actor-critic algorithms enabling agents to learn and act independently without synchronization, improving performance in complex multi-agent environments.
Contribution
It develops the first set of asynchronous policy gradient methods for multi-agent reinforcement learning across various training paradigms.
Findings
The asynchronous algorithms outperform synchronous methods in large multi-agent tasks.
Experiments in simulation and on hardware show that the methods learn high-quality asynchronous solutions.
The methods are effective across a variety of realistic domains.
Abstract
Synchronizing decisions across multiple agents in realistic settings is problematic since it requires agents to wait for other agents to terminate and communicate about termination reliably. Ideally, agents should learn and execute asynchronously instead. Such asynchronous methods also allow temporally extended actions that can take different amounts of time based on the situation and action executed. Unfortunately, current policy gradient methods are not applicable in asynchronous settings, as they assume that agents synchronously reason about action selection at every time step. To allow asynchronous learning and decision-making, we formulate a set of asynchronous multi-agent actor-critic methods that allow agents to directly optimize asynchronous policies in three standard training paradigms: decentralized learning, centralized learning, and centralized training for decentralized…
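To make the asynchronous setting concrete, the following is a minimal Python sketch of asynchronous execution with temporally extended actions. It is not the authors' algorithm: a random choice stands in for a learned policy, and the names `Agent`, `run_async`, and the `DURATIONS` table are hypothetical, chosen only for illustration. Each agent selects a new extended action only when its own previous action has terminated, so no agent waits for the others.

```python
import random
from dataclasses import dataclass, field

# Hypothetical durations (in primitive time steps) of illustrative extended actions.
DURATIONS = {"go_to_A": 3, "go_to_B": 5, "wait": 1}


@dataclass
class Agent:
    name: str
    action: str = ""
    remaining: int = 0  # primitive steps left in the current extended action
    decisions: list = field(default_factory=list)  # (time step, action) pairs

    def step(self, t, rng):
        # Select a new extended action only when the previous one has terminated.
        # Assumption: a random choice stands in for a learned policy.
        if self.remaining == 0:
            self.action = rng.choice(sorted(DURATIONS))
            self.remaining = DURATIONS[self.action]
            self.decisions.append((t, self.action))
        self.remaining -= 1


def run_async(n_agents=3, horizon=12, seed=0):
    rng = random.Random(seed)
    agents = [Agent(f"agent{i}") for i in range(n_agents)]
    for t in range(horizon):
        for agent in agents:
            agent.step(t, rng)  # no agent waits for the others to terminate
    return agents


if __name__ == "__main__":
    for a in run_async():
        print(a.name, a.decisions)
```

In this sketch the decision times in `decisions` generally differ from agent to agent, which is the situation that step-synchronous policy gradient methods do not handle.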
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Auction Theory and Applications
