Multi-LLM Query Optimization
Arlen Dean, Zijin Zhang, Stefanus Jasin, Yuqing Liu

TL;DR
This paper addresses the challenge of optimally allocating queries across multiple heterogeneous large language models to minimize cost while maintaining error guarantees, introducing a surrogate optimization approach and approximation scheme.
Contribution
It formulates a robust offline query-planning problem, proves its NP-hardness, and develops a surrogate with theoretical guarantees and an efficient approximation algorithm.
Findings
Optimizing the surrogate yields query plans that closely approximate the true optimal plan.
The proposed surrogate is asymptotically tight as error tolerances decrease.
An asymptotic fully polynomial-time approximation scheme (AFPTAS) achieves near-optimal solutions within a factor of (1+ε).
Abstract
Deploying multiple large language models (LLMs) in parallel to classify an unknown ground-truth label is a common practice, yet the problem of optimally allocating queries across heterogeneous models remains poorly understood. In this paper, we formulate a robust, offline query-planning problem that minimizes total query cost subject to statewise error constraints which guarantee reliability for every possible ground-truth label. We first establish that this problem is NP-hard via a reduction from the minimum-weight set cover problem. To overcome this intractability, we develop a surrogate by combining a union bound decomposition of the multi-class error into pairwise comparisons with Chernoff-type concentration bounds. The resulting surrogate admits a closed-form, multiplicatively separable expression in the query counts and is guaranteed to be feasibility-preserving. We further show…
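The union-bound and Chernoff-type construction described in the abstract can be illustrated with a small sketch. This is not the paper's exact surrogate, which the truncated abstract does not spell out. It assumes each model i has a known response distribution conf[i][y] for each true label y, that responses are independent across queries, and that the final decision is made by maximum likelihood. Under those assumptions, the standard Bhattacharyya (Chernoff with exponent 1/2) bound gives, for true label y, an error of at most the sum over y' != y of the product over models of rho_i(y, y')^n_i, where rho_i is the Bhattacharyya coefficient and n_i is the number of queries sent to model i. The bound is multiplicatively separable in the query counts, and any plan that satisfies it also satisfies the true error constraint. The names conf, costs, alpha, and max_queries are hypothetical, and the brute-force search stands in for the paper's approximation scheme only on tiny instances.

```python
import itertools
import math


def bhattacharyya(p, q):
    """Bhattacharyya coefficient between two discrete distributions."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def surrogate_error(conf, counts, y):
    """Union bound over rival labels of the pairwise Chernoff-type bounds.

    conf[i][y] is the (assumed known) response distribution of model i
    when the true label is y; counts[i] is the number of queries to model i.
    """
    num_labels = len(conf[0])
    total = 0.0
    for rival in range(num_labels):
        if rival == y:
            continue
        term = 1.0
        for i, n in enumerate(counts):
            term *= bhattacharyya(conf[i][y], conf[i][rival]) ** n
        total += term
    return total


def cheapest_plan(conf, costs, alpha, max_queries=10):
    """Exhaustively search for the cheapest plan meeting every statewise bound.

    Returns (cost, counts), or None if no plan within max_queries is feasible.
    Exponential in the number of models; for illustration only.
    """
    num_labels = len(conf[0])
    best = None
    for counts in itertools.product(range(max_queries + 1), repeat=len(costs)):
        feasible = all(
            surrogate_error(conf, counts, y) <= alpha for y in range(num_labels)
        )
        if not feasible:
            continue
        cost = sum(c * n for c, n in zip(costs, counts))
        if best is None or cost < best[0]:
            best = (cost, counts)
    return best


if __name__ == "__main__":
    # Two hypothetical binary classifiers: a cheap noisy one and a costly accurate one.
    conf = [
        [[0.7, 0.3], [0.3, 0.7]],
        [[0.95, 0.05], [0.05, 0.95]],
    ]
    print(cheapest_plan(conf, costs=[1.0, 4.0], alpha=0.05))
```

The exhaustive search is only practical for a handful of models and small query caps, which is consistent with the NP-hardness result and the motivation for an approximation scheme.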
Taxonomy
Topics: Natural Language Processing Techniques · Machine Learning and Algorithms · Topic Modeling
