Orla: A Library for Serving LLM-Based Multi-Agent Systems
Rana Shahout, Hayder Tirmazi, Minlan Yu, Michael Mitzenmacher

TL;DR
Orla is a library for building and serving multi-agent systems based on large language models. It provides abstractions for workflow management, resource allocation, and state handling.
Contribution
Orla introduces a general abstraction for orchestrating multi-stage LLM workflows that separates request execution from workflow-level policy, and it provides mechanisms for stage mapping, scheduling, and memory management.
Findings
Stage mapping improves latency and cost efficiency.
Workflow cache management reduces time-to-first-token.
Demonstrated effectiveness on customer support workflows.
Abstract
We introduce Orla, a library for constructing and running LLM-based agentic systems. Modern agentic applications consist of workflows that combine multiple LLM inference steps, tool calls, and heterogeneous infrastructure. Today, developers typically build these systems by manually composing orchestration code with LLM serving engines and tool execution logic. Orla provides a general abstraction that separates request execution from workflow-level policy. It acts as a serving layer above existing LLM inference engines: developers define workflows composed of stages, while Orla manages how those stages are mapped, executed, and coordinated across models and backends. It provides agent-level control through three mechanisms: a stage mapper, which assigns each stage to an appropriate model and backend; a workflow orchestrator, which schedules stages and manages their resources and context;…
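As a rough illustration of the stage-mapping idea described in the abstract, the sketch below assigns each stage of a workflow to a model and backend before execution. It is not Orla's API: the names `Stage`, `Target`, `map_stage`, and `plan_workflow`, the `needs_reasoning` hint, and the model and backend labels are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step of a workflow (hypothetical type, not part of Orla)."""
    name: str
    needs_reasoning: bool  # assumed hint used by the toy mapping policy


@dataclass(frozen=True)
class Target:
    """A model/backend pair that a stage is assigned to (hypothetical)."""
    model: str
    backend: str


def map_stage(stage: Stage) -> Target:
    """Toy policy: send reasoning-heavy stages to a larger model."""
    if stage.needs_reasoning:
        return Target(model="large-model", backend="gpu-backend")
    return Target(model="small-model", backend="cpu-backend")


def plan_workflow(stages: list[Stage]) -> list[tuple[str, Target]]:
    """Return the stage-to-target assignment in workflow order."""
    return [(stage.name, map_stage(stage)) for stage in stages]


# A two-stage customer-support workflow, used only as an illustration.
workflow = [
    Stage(name="classify_ticket", needs_reasoning=False),
    Stage(name="draft_reply", needs_reasoning=True),
]
plan = plan_workflow(workflow)
for stage_name, target in plan:
    print(stage_name, target.model, target.backend)
```

The point of the sketch is the separation the abstract describes: the workflow is declared as a list of stages, while the policy that decides where each stage runs lives in a separate function that can be swapped without touching the workflow definition.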
Taxonomy
Topics: Multi-Agent Systems and Negotiation · Scientific Computing and Data Management · Semantic Web and Ontologies
