Harnessing Language for Coordination: A Framework and Benchmark for LLM-Driven Multi-Agent Control

Timothée Anne, Noah Syrkis, Meriem Elhosni, Florian Turati, Franck Legendre, Alain Jaquier, and Sebastian Risi

arXiv:2412.11761 · cs.AI · April 24, 2025

Open Access

TL;DR

This paper introduces HIVE, a framework that lets a single human coordinate large swarms of agents through natural language, and a benchmark for evaluating LLMs' multi-agent control capabilities, revealing both their potential and their limitations.

Contribution

The paper presents a novel framework, HIVE, and a real-time strategy game benchmark for LLM-driven multi-agent coordination, advancing the understanding of language-based control in complex scenarios.

Findings

01 The hybrid approach effectively coordinates agent movements and exploits unit weaknesses.

02 Current models struggle to process spatial visual information.

03 Current models face challenges in formulating long-term strategic plans.

Abstract

Large Language Models (LLMs) have demonstrated remarkable performance across various tasks. Their potential to facilitate human coordination with many agents is a promising but largely under-explored area. Such capabilities would be helpful in disaster response, urban planning, and real-time strategy scenarios. In this work, we introduce (1) a real-time strategy game benchmark designed to evaluate these abilities and (2) a novel framework we term HIVE. HIVE empowers a single human to coordinate swarms of up to 2,000 agents through a natural language dialog with an LLM. We present promising results on this multi-agent benchmark, with our hybrid approach solving tasks such as coordinating agent movements, exploiting unit weaknesses, leveraging human annotations, and understanding terrain and strategic points. Our findings also highlight critical limitations of current models, including…

Taxonomy

Topics: Multi-Agent Systems and Negotiation