Asymptotically Optimal Sequential Testing with Heterogeneous LLMs
Guokai Li, Alys Liang, Mo Liu, Murray Lei, Stefanus Jasin, Fenghua Yang, Preet Baxi

TL;DR
This paper develops an asymptotically optimal Bayesian sequential testing strategy for multiple heterogeneous large language models that balances query and waiting costs, and shows that at most two models are needed for optimality as the error tolerance vanishes.
Contribution
It introduces belief-dependent policies that optimally combine two LLMs for hypothesis testing, extending previous single-model approaches and proving asymptotic optimality.
Findings
Optimal policies use at most two LLMs asymptotically.
Single-LLM policies are not generally optimal under asymmetry.
Constructed policies match the lower bound up to a small factor as error decreases.
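The role of asymmetry in these findings can be illustrated with a small sketch. It assumes each LLM returns a binary answer and is described by two accuracies, one per hypothesis, and it measures the information rate under each hypothesis by the Kullback-Leibler divergence between the two answer distributions. The function names, the labels H0 and H1, and the example accuracies are hypothetical and not taken from the paper, and costs are ignored here for simplicity.

```python
import math

def bernoulli_kl(p, q):
    """KL divergence KL(Bern(p) || Bern(q)) in nats, for 0 < p, q < 1."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def information_rates(acc_h0, acc_h1):
    """Return (rate under H0, rate under H1) for one LLM.

    acc_h0: probability of answering 0 when H0 is true (assumed notation).
    acc_h1: probability of answering 1 when H1 is true (assumed notation).
    """
    p1_under_h0 = 1 - acc_h0  # chance of answer "1" when H0 holds
    p1_under_h1 = acc_h1      # chance of answer "1" when H1 holds
    rate_h0 = bernoulli_kl(p1_under_h0, p1_under_h1)
    rate_h1 = bernoulli_kl(p1_under_h1, p1_under_h0)
    return rate_h0, rate_h1

# Two hypothetical LLMs with mirrored, asymmetric accuracies.
llms = {"A": (0.95, 0.60), "B": (0.60, 0.95)}
rates = {name: information_rates(*acc) for name, acc in llms.items()}
best_under_h0 = max(rates, key=lambda name: rates[name][0])
best_under_h1 = max(rates, key=lambda name: rates[name][1])
print(rates, best_under_h0, best_under_h1)
```

In this toy setting the model with the higher rate under one hypothesis is not the one with the higher rate under the other, which is the kind of situation in which a single-LLM policy can be suboptimal.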
Abstract
We study a Bayesian binary sequential hypothesis testing problem with multiple large language models (LLMs). Each LLM has a per-query cost, a random waiting time with a given mean and sub-Gaussian tails, and asymmetric accuracies: the probability of returning the correct label depends on the true hypothesis and need not be the same under the two hypotheses. This asymmetry induces two distinct information rates per LLM, one under each hypothesis. The decision-maker chooses LLMs sequentially, observes their noisy binary answers, and stops when the posterior probability of one hypothesis exceeds a prescribed threshold. The objective is to minimize the sum of expected query cost and expected waiting cost, expressed through the total query cost, the total waiting time, and a polynomial function…
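The posterior update and stopping rule described in the abstract can be sketched as follows for a single LLM. This is a minimal illustration under stated assumptions, not the paper's policy: the threshold is taken to be 1 - delta for a hypothetical error tolerance delta, the hypotheses are labeled 0 and 1, and the names `update_posterior` and `run_test` are invented for this example. Query and waiting costs are not modeled; only the number of queries is counted.

```python
import random

def update_posterior(prior_h1, answer, acc_h0, acc_h1):
    """Bayes update of P(H1) after one binary answer from an LLM.

    acc_h0 = P(answer 0 | H0), acc_h1 = P(answer 1 | H1) (assumed notation).
    """
    like_h1 = acc_h1 if answer == 1 else 1 - acc_h1
    like_h0 = 1 - acc_h0 if answer == 1 else acc_h0
    numerator = prior_h1 * like_h1
    return numerator / (numerator + (1 - prior_h1) * like_h0)

def run_test(truth, acc_h0, acc_h1, delta, prior_h1=0.5, seed=0, max_queries=10_000):
    """Query one LLM until the posterior of some hypothesis exceeds 1 - delta.

    Returns (decision, number of queries, final belief in H1).
    The 1 - delta threshold is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    belief = prior_h1
    queries = 0
    while max(belief, 1 - belief) <= 1 - delta and queries < max_queries:
        # Simulate a noisy answer whose accuracy depends on the true hypothesis.
        p_answer_1 = acc_h1 if truth == 1 else 1 - acc_h0
        answer = 1 if rng.random() < p_answer_1 else 0
        belief = update_posterior(belief, answer, acc_h0, acc_h1)
        queries += 1
    decision = 1 if belief >= 0.5 else 0
    return decision, queries, belief

decision, queries, belief = run_test(truth=1, acc_h0=0.9, acc_h1=0.8, delta=0.01)
print(decision, queries, belief)
```

A multi-LLM policy would additionally choose which model to query at each step based on the current belief; that choice is the subject of the paper and is not reproduced here.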
