ZEBRA: Zero-shot Budgeted Resource Allocation for LLM Orchestration
May Hamri, Inbal Talgam-Cohen

TL;DR
ZEBRA is a zero-shot framework that splits a budget across the phases of an LLM pipeline by solving a continuous nonlinear knapsack problem; it outperforms direct LLM-based allocation across multiple tasks.
Contribution
The paper introduces ZEBRA, a zero-shot approach that allocates a budget across pipeline phases at inference time and handles additive and multiplicative aggregations with a single solver.
Findings
ZEBRA outperforms LLM-direct on the APPS coding benchmark across all metrics.
At 50% budget, ZEBRA recovers 94.4% of unconstrained quality, compared with 88.1% for LLM-direct.
On HotpotQA, ZEBRA improves performance by 14.3 percentage points over LLM-direct.
Abstract
As autonomous agents increasingly execute end-to-end tasks under fixed monetary budgets, the pressing open question shifts from whether the budget is respected, to how to spend it effectively. Existing budget-aware methods typically control reasoning step-by-step within a single agent, or learn resource allocation policies via RL. None address how to split a budget across the composing phases of a multi-agent pipeline at inference time. We propose ZEBRA, a zero-shot framework that reduces multi-phase budget allocation to a continuous nonlinear knapsack problem: an LLM controller estimates per-phase utility curves, and a water-filling search on the Lagrange multiplier returns the per-phase split. Additive and multiplicative aggregations are unified under the same solver. On a -task APPS coding benchmark, both ZEBRA variants outperform LLM-direct (budget allocation directly by an…
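The water-filling step described in the abstract can be illustrated with a minimal sketch for the additive case. Everything specific here is an assumption made for illustration and is not taken from the paper: the saturating utility form u(b) = a * (1 - exp(-k * b)), the function name `allocate`, and the example parameters. In ZEBRA the curves come from an LLM controller; here they are hard-coded. The sketch bisects on the Lagrange multiplier until the per-phase spends, each set where marginal utility equals the multiplier, add up to the budget.

```python
import math

def allocate(phases, budget, iters=100):
    """Split `budget` across phases to maximize the sum of utilities.

    phases: list of (a, k) pairs for the assumed (hypothetical) concave
    utility u(b) = a * (1 - exp(-k * b)), with a > 0 and k > 0.
    """
    def spend(lam):
        # Marginal utility is a*k*exp(-k*b). Setting it equal to lam gives
        # b = ln(a*k/lam)/k; phases whose marginal utility at b = 0 is
        # already below lam receive nothing.
        return [max(0.0, math.log(a * k / lam) / k) for a, k in phases]

    lo = 1e-12                          # tiny multiplier: spends a lot
    hi = max(a * k for a, k in phases)  # at this multiplier every spend is 0
    for _ in range(iters):
        mid = math.sqrt(lo * hi)        # bisect in log space
        if sum(spend(mid)) > budget:
            lo = mid                    # over budget: raise the multiplier
        else:
            hi = mid                    # within budget: lower the multiplier
    return spend(hi)                    # hi always stays within budget

# Hypothetical three-phase pipeline with a total budget of 10 units.
split = allocate([(1.0, 0.5), (2.0, 0.1), (0.5, 1.0)], budget=10.0)
```

Because total spend decreases monotonically as the multiplier grows, bisection is enough; a multiplicative aggregation would need its own marginal condition, which this sketch does not attempt.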
