Hierarchical Strategies for Cooperative Multi-Agent Reinforcement Learning
Majd Ibrahim, Ammar Fayad

TL;DR
This paper introduces a hierarchical multi-agent reinforcement learning framework that predicts agent behaviors and plans strategies, achieving state-of-the-art results on complex tasks like StarCraft II and Google Research Football.
Contribution
It proposes a novel two-level hierarchical architecture with a latent policy and information-theoretic objectives, enabling agents to learn and coordinate strategies effectively.
Findings
Achieved a win rate above 95% on the full Google Research Football game.
First MARL algorithm to solve all StarCraft II super hard scenarios.
Outperformed all existing methods on benchmark tasks.
Abstract
Adequate strategizing of agents' behaviors is essential to solving cooperative MARL problems. One intuitively beneficial yet uncommon method in this domain is predicting agents' future behaviors and planning accordingly. Leveraging this point, we propose a two-level hierarchical architecture that combines a novel information-theoretic objective with a trajectory prediction model to learn a strategy. To this end, we introduce a latent policy that learns two types of latent strategies, individual and relational, using a modified Graph Attention Network module to extract interaction features. We encourage each agent to behave according to the strategy by conditioning its local functions on the individual strategy, and we further equip agents with a shared function that conditions on the relational strategy. Additionally, we introduce two regularizers to allow predicted trajectories to be accurate and…
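To make the idea of conditioning an agent's local function on a latent strategy concrete, here is a minimal sketch. It is not the authors' implementation. It assumes the local function is an action-value function, that conditioning is done by concatenating the observation with the latent strategy vector, and that a small two-layer network is enough for illustration. The names init_params, local_q_values, and all dimensions are hypothetical.

```python
import random


def init_params(obs_dim, latent_dim, hidden_dim, n_actions, seed=0):
    """Create small random weight matrices (hypothetical sizes, no biases)."""
    rng = random.Random(seed)
    in_dim = obs_dim + latent_dim

    def matrix(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {"W1": matrix(hidden_dim, in_dim), "W2": matrix(n_actions, hidden_dim)}


def linear(weights, x):
    """Matrix-vector product: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def local_q_values(obs, z, params):
    """Return one value per action, given a local observation and a latent strategy z.

    Assumption: conditioning is plain concatenation of obs and z.
    """
    x = list(obs) + list(z)
    hidden = [max(0.0, h) for h in linear(params["W1"], x)]  # ReLU
    return linear(params["W2"], hidden)


# Usage: the same observation evaluated under one latent strategy.
params = init_params(obs_dim=4, latent_dim=2, hidden_dim=8, n_actions=3)
obs = [1.0, 0.5, -0.5, 0.0]
q_values = local_q_values(obs, [1.0, 0.0], params)
greedy_action = max(range(len(q_values)), key=q_values.__getitem__)
```

Because the latent vector is part of the input, changing it while holding the observation fixed can change which action looks best. This is one simple way a higher-level strategy could steer lower-level behavior.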
Taxonomy
Topics: Sports Analytics and Performance · Digital Games and Media · Artificial Intelligence in Games
Methods: Dense Connections · 1x1 Convolution · Feedforward Network · Two Time-scale Update Rule · Projection Discriminator · Non-Local Operation · Adam · Non-Local Block
