ZEBRA: Zero-shot Budgeted Resource Allocation for LLM Orchestration

May Hamri; Inbal Talgam-Cohen

arXiv:2605.20485·cs.LG·May 21, 2026

TL;DR

ZEBRA is a zero-shot framework for multi-phase budget allocation in LLM pipelines. It casts the per-phase split as a continuous nonlinear knapsack problem and outperforms direct LLM-based allocation (LLM-direct) on the APPS and HotpotQA tasks.

Contribution

The paper introduces ZEBRA, a zero-shot approach that allocates a fixed budget across pipeline phases at inference time and handles additive and multiplicative aggregations with the same solver.

Findings

01

ZEBRA outperforms LLM-direct on the APPS coding benchmark across all metrics.

02

At 50% budget, ZEBRA recovers 94.4% of unconstrained quality, higher than LLM-direct's 88.1%.

03

On HotpotQA, ZEBRA improves performance by 14.3 percentage points over LLM-direct.

Abstract

As autonomous agents increasingly execute end-to-end tasks under fixed monetary budgets, the pressing open question shifts from whether the budget is respected, to how to spend it effectively. Existing budget-aware methods typically control reasoning step-by-step within a single agent, or learn resource allocation policies via RL. None address how to split a budget across the composing phases of a multi-agent pipeline at inference time. We propose ZEBRA, a zero-shot framework that reduces multi-phase budget allocation to a continuous nonlinear knapsack problem: an LLM controller estimates per-phase utility curves, and a water-filling search on the Lagrange multiplier returns the per-phase split. Additive and multiplicative aggregations are unified under the same solver. On a 150-task APPS coding benchmark, both ZEBRA variants outperform LLM-direct (budget allocation directly by an…
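
To make the water-filling idea in the abstract concrete, here is a minimal Python sketch of a generic continuous knapsack solver. It is not the authors' implementation. The function name `allocate`, the logarithmic utility curves, and the numeric values are all illustrative assumptions; in ZEBRA the curves are estimated by an LLM controller. The sketch assumes each phase's marginal utility is continuous and strictly decreasing, so for any multiplier ("price") each phase spends until its marginal gain falls to that price, and the price is bisected until the total spend meets the budget.

```python
from typing import Callable, List


def allocate(marginals: List[Callable[[float], float]],
             budget: float, iters: int = 60) -> List[float]:
    """Split `budget` across phases, given decreasing marginal-utility functions."""

    def spend_at(lam: float) -> List[float]:
        # At price `lam`, each phase spends until its marginal gain drops to `lam`.
        out = []
        for g in marginals:
            if g(0.0) <= lam:          # not worth funding at this price
                out.append(0.0)
                continue
            if g(budget) >= lam:       # would absorb the whole budget
                out.append(budget)
                continue
            lo, hi = 0.0, budget
            for _ in range(iters):     # inner bisection: solve g(b) = lam
                mid = 0.5 * (lo + hi)
                if g(mid) > lam:
                    lo = mid
                else:
                    hi = mid
            out.append(0.5 * (lo + hi))
        return out

    # Outer bisection on the multiplier: total spend shrinks as `lam` grows.
    lam_lo, lam_hi = 0.0, max(g(0.0) for g in marginals)
    for _ in range(iters):
        lam = 0.5 * (lam_lo + lam_hi)
        if sum(spend_at(lam)) > budget:
            lam_lo = lam
        else:
            lam_hi = lam
    return spend_at(lam_hi)            # feasible side: total spend <= budget


# Hypothetical additive example: u_i(b) = a_i * log(1 + b / s_i),
# whose marginal utility is a_i / (s_i + b).
params = [(3.0, 1.0), (2.0, 1.0), (1.0, 1.0)]
marginals = [lambda b, a=a, s=s: a / (s + b) for a, s in params]
split = allocate(marginals, budget=10.0)
```

For these assumed curves the stationarity condition a_i / (s_i + b_i) = λ can be solved by hand: λ = 6/13, giving a split of roughly 5.5, 3.33, and 1.17. One way a single solver could also cover a multiplicative objective is to pass the marginals of the log-utilities, since maximizing a product of positive utilities is equivalent to maximizing the sum of their logarithms; whether the paper unifies the two cases this way is not stated in the abstract.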
