Hive: A Multi-Agent Infrastructure for Algorithm- and Task-Level Scaling
Zizhang Luo, Yuhao Luo, Youwei Xiao, Yansong Xu, Runlin Guo, Yun Liang

TL;DR
Hive is a multi-agent infrastructure for algorithm- and task-level scaling of LLM-based agentic systems. It reduces redundant computation across reasoning branches and makes scheduling aware of multiple agents when allocating resources.
Contribution
The paper presents Hive, a multi-agent infrastructure whose mechanisms include Logits Cache, which speeds up re-sampling, and Agent-Aware Scheduling, which reduces the hotspot miss rate.
Findings
Logits Cache achieves a 1.11x-1.76x speedup in re-sampling.
Agent-Aware Scheduling reduces the hotspot miss rate by 33%-51%.
Hive enables scalable and efficient multi-agent language model deployment.
Abstract
Large language models are increasingly deployed as complex agentic systems that scale with task complexity. While prior work has extensively explored model- and system-level scaling, algorithm- and task-level scaling remain largely unaddressed, constraining the full potential of agentic systems. At the algorithm level, allocating additional inference-time computation can enhance workflow capacity but introduces cross-path redundancy: overlapping computations across multiple reasoning branches. At the task level, complex tasks can be decomposed into subproblems and delegated across multiple agents for improved scalability and parallelism. However, scheduling in existing infrastructures is unaware of multiple agents and therefore misses opportunities to optimize resource allocation. We propose Hive, a multi-agent infrastructure that enables algorithm- and task-level scaling. Hive…
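To make the idea of cross-path redundancy in re-sampling concrete, here is a minimal Python sketch. It assumes that re-sampling from an identical token prefix can reuse previously computed next-token logits instead of repeating the forward pass. This is an illustration only and not Hive's actual Logits Cache design, which this summary does not describe; the names `LogitsCache`, `toy_forward`, `sample`, and `resample` are hypothetical.

```python
import math
import random
from typing import Callable, Dict, List, Sequence, Tuple


class LogitsCache:
    """Memoize next-token logits by token prefix (illustrative, not Hive's design)."""

    def __init__(self, forward: Callable[[Tuple[int, ...]], List[float]]):
        self._forward = forward  # stand-in for an expensive model forward pass
        self._store: Dict[Tuple[int, ...], List[float]] = {}
        self.hits = 0
        self.misses = 0

    def logits(self, prefix: Sequence[int]) -> List[float]:
        key = tuple(prefix)
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = self._forward(key)
        return self._store[key]


def toy_forward(prefix: Tuple[int, ...]) -> List[float]:
    # Hypothetical model: deterministic pseudo-logits over a 4-token vocabulary.
    seed = sum((i + 1) * t for i, t in enumerate(prefix))
    return [float((seed + v * 7) % 5) for v in range(4)]


def sample(logits: List[float], rng: random.Random, temperature: float = 1.0) -> int:
    # Softmax sampling with max-subtraction for numerical stability.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    weights = [math.exp(x - m) for x in scaled]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def resample(cache: LogitsCache, prefix: Sequence[int], n: int, seed: int = 0) -> List[int]:
    # Draw n next tokens from the same prefix; only the first draw runs the model.
    rng = random.Random(seed)
    return [sample(cache.logits(prefix), rng) for _ in range(n)]


if __name__ == "__main__":
    cache = LogitsCache(toy_forward)
    tokens = resample(cache, [1, 2, 3], n=8)
    print(tokens, "hits:", cache.hits, "misses:", cache.misses)
```

In this sketch, eight draws from the same prefix trigger a single forward call and seven cache hits. A real system would also need to bound cache memory and decide which prefixes are worth keeping, which this sketch does not attempt.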
