Enhancing Multi-Agent Collaboration with Attention-Based Actor-Critic Policies
Hugo Garrido-Lestache Belinchon, Jeremy Kedziora

TL;DR
This paper presents TAAC, a novel reinforcement learning algorithm with attention mechanisms for improved multi-agent collaboration, demonstrating superior performance in simulated soccer tasks compared to existing methods.
Contribution
Introduces TAAC, integrating multi-headed attention in actor-critic frameworks for dynamic inter-agent communication and role diversity in cooperative multi-agent systems.
Findings
TAAC outperforms benchmark algorithms in simulated soccer.
Enhanced collaborative behaviors observed with TAAC.
TAAC achieves higher win rates and better tactical coordination.
Abstract
This paper introduces Team-Attention-Actor-Critic (TAAC), a reinforcement learning algorithm designed to enhance multi-agent collaboration in cooperative environments. TAAC employs a Centralized Training/Centralized Execution scheme incorporating multi-headed attention mechanisms in both the actor and critic. This design facilitates dynamic inter-agent communication, allowing agents to explicitly query teammates, thereby efficiently managing the exponential growth of joint-action spaces while ensuring a high degree of collaboration. We further introduce a penalized loss function that promotes diverse yet complementary roles among agents. We evaluate TAAC in a simulated soccer environment against benchmark algorithms representing other multi-agent paradigms, including Proximal Policy Optimization and Multi-Agent Actor-Attention-Critic. We find that TAAC exhibits superior performance…
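The abstract's idea of agents "querying teammates" through multi-headed attention can be illustrated with a small sketch. This is not the authors' implementation: it is a generic scaled dot-product attention over per-agent embeddings, written with the standard library only. The function names (`team_attention`, `attention_head`), the dimensions, and the random projection matrices are all assumptions chosen for illustration; in a trained model the projections would be learned.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(q, k, v):
    """Scaled dot-product attention: each agent's query scores every agent's key."""
    d_head = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)  # shape: (n_agents, n_agents)
    weights = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
    return matmul(weights, v), weights


def team_attention(embeddings, heads):
    """Run several heads and concatenate their outputs per agent.

    `heads` is a list of (w_q, w_k, w_v) projection triples (hypothetical,
    randomly initialised here rather than learned).
    """
    outputs, all_weights = [], []
    for w_q, w_k, w_v in heads:
        out, weights = attention_head(
            matmul(embeddings, w_q), matmul(embeddings, w_k), matmul(embeddings, w_v)
        )
        outputs.append(out)
        all_weights.append(weights)
    fused = [sum((out[i] for out in outputs), []) for i in range(len(embeddings))]
    return fused, all_weights


def rand_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Illustrative sizes only: 3 agents, 4-dim embeddings, 2 heads of width 2.
rng = random.Random(0)
n_agents, d_model, d_head, num_heads = 3, 4, 2, 2
embeddings = rand_matrix(n_agents, d_model, rng)
heads = [tuple(rand_matrix(d_model, d_head, rng) for _ in range(3)) for _ in range(num_heads)]
fused, weights = team_attention(embeddings, heads)
```

In this sketch, `weights[h][i][j]` is how strongly agent `i` attends to agent `j` under head `h`. The attention cost grows with the square of the number of agents rather than with the size of the joint-action space, which is one plausible reading of how attention helps manage that growth.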
Taxonomy
Topics: Complex Systems and Decision Making · Team Dynamics and Performance · Business Process Modeling and Analysis
