HiveMind: OS-Inspired Scheduling for Concurrent LLM Agent Workloads

Justice Owusu Agyemang; Jerry John Kponyo; Obed Kwasi Somuah; Elliot Amponsah; Godfred Manu Addo Boakye; Kwame Opuni-Boachie Obour Agyekum

arXiv:2604.17111·cs.DC·April 21, 2026

TL;DR

HIVEMIND is an OS-inspired HTTP proxy that schedules concurrent LLM agent requests to a shared, rate-limited API endpoint, preventing contention failures and wasted compute without modifying existing agent code.

Contribution

The paper introduces HIVEMIND, a novel proxy applying OS-like scheduling primitives to coordinate LLM API calls, improving reliability and efficiency in multi-agent environments.
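To make the idea of OS-like scheduling primitives concrete, here is a minimal sketch of two of them working together: admission control (a cap on in-flight requests) and priority queuing for requests that must wait. The class name `PriorityAdmission`, its methods, and the "lower number means more urgent" convention are hypothetical choices for illustration; the summary does not describe how HIVEMIND implements these primitives.

```python
import heapq
import itertools


class PriorityAdmission:
    """Admit at most max_in_flight requests; queue the rest by priority.

    Illustrative only: names and policy are assumptions, not HIVEMIND's API.
    """

    def __init__(self, max_in_flight):
        self.max_in_flight = max_in_flight
        self.in_flight = 0
        self._waiting = []             # heap of (priority, seq, request_id)
        self._seq = itertools.count()  # tie-breaker: FIFO within one priority

    def submit(self, request_id, priority):
        """Return True if admitted now, False if queued (lower = more urgent)."""
        if self.in_flight < self.max_in_flight:
            self.in_flight += 1
            return True
        heapq.heappush(self._waiting, (priority, next(self._seq), request_id))
        return False

    def release(self):
        """Mark one request finished; return the id admitted in its place, or None."""
        self.in_flight -= 1
        if self._waiting:
            _, _, request_id = heapq.heappop(self._waiting)
            self.in_flight += 1
            return request_id
        return None


gate = PriorityAdmission(max_in_flight=2)
gate.submit("agent-a", priority=5)   # admitted
gate.submit("agent-b", priority=5)   # admitted
gate.submit("agent-c", priority=5)   # queued
gate.submit("agent-d", priority=1)   # queued, but more urgent than agent-c
next_up = gate.release()             # agent-d is admitted ahead of agent-c
```

A proxy sitting in front of the API could call `submit` when a request arrives and `release` when a response completes, so agents never see more concurrency than the endpoint can absorb.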

Findings

01. Failure rates drop from 72-100% to 0-18% with HIVEMIND.

02. HIVEMIND reduces wasted compute by up to 100%.

03. Overhead per request is under 3 ms, confirming efficiency.

Abstract

When multiple LLM coding agents share a rate-limited API endpoint, they exhibit resource contention patterns analogous to unscheduled OS processes competing for CPU, memory, and I/O. In a motivating incident, 3 of 11 parallel agents died from connection resets and HTTP 502 errors - a 27% failure rate - despite the API having sufficient aggregate capacity to serve all 11 sequentially. We present HIVEMIND, a transparent HTTP proxy that applies five OS-inspired scheduling primitives - admission control, rate-limit tracking, AIMD backpressure with circuit breaking, token budget management, and priority queuing - to eliminate the failure modes caused by uncoordinated parallel execution. The proxy requires zero modifications to existing agent code and supports Anthropic, OpenAI, and local model APIs via auto-detected provider profiles. Our evaluation across seven scenarios (5-50 concurrent…
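As an illustration of one primitive named in the abstract, the sketch below shows AIMD (additive increase, multiplicative decrease) backpressure combined with a simple circuit breaker. Everything specific here is an assumption made for the example: the class name `AimdLimiter`, the increase step of 1, the halving factor, the failure threshold of 3, and the 5-second cooldown. The abstract does not give HIVEMIND's actual parameters or state machine.

```python
class AimdLimiter:
    """Concurrency limit that grows by 1 on success and halves on failure.

    After failure_threshold consecutive failures the breaker opens and
    rejects all requests until cooldown seconds have passed.
    Illustrative only: parameters are assumptions, not HIVEMIND's values.
    """

    def __init__(self, initial=8, floor=1, ceiling=64,
                 failure_threshold=3, cooldown=5.0):
        self.limit = initial
        self.floor = floor
        self.ceiling = ceiling
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.consecutive_failures = 0
        self.opened_at = None  # time the breaker tripped, or None while closed

    def on_success(self):
        # Additive increase; a success also closes the breaker.
        self.consecutive_failures = 0
        self.opened_at = None
        self.limit = min(self.ceiling, self.limit + 1)

    def on_failure(self, now):
        # Multiplicative decrease, e.g. after an HTTP 429 or 502.
        self.consecutive_failures += 1
        self.limit = max(self.floor, self.limit // 2)
        if self.consecutive_failures >= self.failure_threshold:
            self.opened_at = now

    def allows(self, in_flight, now):
        # Reject everything while the breaker is open and still cooling down.
        if self.opened_at is not None and now - self.opened_at < self.cooldown:
            return False
        return in_flight < self.limit


limiter = AimdLimiter(initial=8)
limiter.on_failure(now=0.0)   # limit 8 -> 4
limiter.on_failure(now=1.0)   # limit 4 -> 2
limiter.on_failure(now=2.0)   # limit 2 -> 1, breaker opens
blocked = limiter.allows(in_flight=0, now=3.0)   # False: still cooling down
probing = limiter.allows(in_flight=0, now=7.0)   # True: cooldown elapsed
```

Passing `now` explicitly keeps the sketch deterministic; a real proxy would more likely read a monotonic clock such as `time.monotonic()`.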
