Harnessing Language for Coordination: A Framework and Benchmark for LLM-Driven Multi-Agent Control
Timothée Anne, Noah Syrkis, Meriem Elhosni, Florian Turati, Franck Legendre, Alain Jaquier, and Sebastian Risi

TL;DR
This paper introduces HIVE, a framework that lets a human coordinate large swarms of agents via natural language, and a benchmark for evaluating LLMs' multi-agent control capabilities, revealing both the potential and the limitations of current models.
Contribution
The paper presents a novel framework, HIVE, and a real-time strategy benchmark for LLM-driven multi-agent coordination, advancing the understanding of language-based control in complex scenarios.
Findings
Hybrid approach effectively coordinates agent movements and exploits unit weaknesses.
Current models struggle with spatial visual information processing.
Challenges exist in formulating long-term strategic plans.
Abstract
Large Language Models (LLMs) have demonstrated remarkable performance across various tasks. Their potential to facilitate human coordination with many agents is a promising but largely under-explored area. Such capabilities would be helpful in disaster response, urban planning, and real-time strategy scenarios. In this work, we introduce (1) a real-time strategy game benchmark designed to evaluate these abilities and (2) a novel framework we term HIVE. HIVE empowers a single human to coordinate swarms of up to 2,000 agents through a natural language dialog with an LLM. We present promising results on this multi-agent benchmark, with our hybrid approach solving tasks such as coordinating agent movements, exploiting unit weaknesses, leveraging human annotations, and understanding terrain and strategic points. Our findings also highlight critical limitations of current models, including…
Taxonomy
Topics: Multi-Agent Systems and Negotiation
