Asymptotically Optimal Sequential Testing with Heterogeneous LLMs

Guokai Li; Alys Liang; Mo Liu; Murray Lei; Stefanus Jasin; Fenghua Yang; Preet Baxi

arXiv:2604.01086·cs.DS·April 3, 2026

TL;DR

This paper develops an asymptotically optimal Bayesian sequential testing strategy for multiple heterogeneous large language models that balances query cost against waiting cost, and shows that at most two models are needed for optimality as the error tolerance shrinks to zero.

Contribution

The paper introduces belief-dependent policies that combine two LLMs for hypothesis testing, extends previous single-model approaches, and proves that these policies are asymptotically optimal.

Findings

01 Optimal policies use at most two LLMs asymptotically.

02 Single-LLM policies are not generally optimal under asymmetry.

03 Constructed policies match the lower bound up to a small factor as error decreases.

Abstract

We study a Bayesian binary sequential hypothesis testing problem with multiple large language models (LLMs). Each LLM $j$ has a per-query cost $c_{j} > 0$, a random waiting time with mean $μ_{j} > 0$ and sub-Gaussian tails, and asymmetric accuracies: the probability of returning the correct label depends on the true hypothesis $θ \in \{A, B\}$ and need not be the same under $A$ and $B$. This asymmetry induces two distinct information rates $(I_{j, A}, I_{j, B})$ per LLM, one under each hypothesis. The decision-maker chooses LLMs sequentially, observes their noisy binary answers, and stops when the posterior probability of one hypothesis exceeds $1 - α$. The objective is to minimize the sum of expected query cost and expected waiting cost, $E[C_{π}] + E[g(W_{π})]$, where $C_{π}$ is the total query cost, $W_{π}$ is the total waiting time, and $g$ is a polynomial function…
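The stopping rule and the two information rates described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's policy: it queries a single LLM repeatedly, ignores waiting time, and assumes that each information rate is the Kullback-Leibler divergence between the answer distributions under the two hypotheses (the abstract does not state the definition). The function names, the accuracy values, and the uniform prior are all hypothetical choices made for illustration.

```python
import math
import random


def kl_bernoulli(p, q):
    """KL divergence KL(Bernoulli(p) || Bernoulli(q)) in nats, for p, q in (0, 1)."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def information_rates(acc_a, acc_b):
    """Assumed definition: (I_A, I_B) as KL divergences between answer laws.

    acc_a = P(correct | theta = A), acc_b = P(correct | theta = B).
    Under A the answer is "A" w.p. acc_a; under B it is "A" w.p. 1 - acc_b.
    """
    i_a = kl_bernoulli(acc_a, 1 - acc_b)
    i_b = kl_bernoulli(acc_b, 1 - acc_a)
    return i_a, i_b


def posterior_update(p_a, answer, acc_a, acc_b):
    """Bayes update of P(theta = A) after one noisy binary answer."""
    like_a = acc_a if answer == "A" else 1 - acc_a
    like_b = 1 - acc_b if answer == "A" else acc_b
    return p_a * like_a / (p_a * like_a + (1 - p_a) * like_b)


def run_single_llm_test(theta, acc_a, acc_b, cost, alpha,
                        prior_a=0.5, seed=0, max_queries=10_000):
    """Query one simulated LLM until some posterior exceeds 1 - alpha.

    Returns (decision, final P(theta = A), total query cost, number of queries).
    Waiting cost g(W) is deliberately left out of this sketch.
    """
    rng = random.Random(seed)
    p_a = prior_a
    n_queries = 0
    while max(p_a, 1 - p_a) <= 1 - alpha and n_queries < max_queries:
        accuracy = acc_a if theta == "A" else acc_b
        correct = rng.random() < accuracy
        wrong_label = "B" if theta == "A" else "A"
        answer = theta if correct else wrong_label
        p_a = posterior_update(p_a, answer, acc_a, acc_b)
        n_queries += 1
    decision = "A" if p_a >= 0.5 else "B"
    return decision, p_a, cost * n_queries, n_queries


if __name__ == "__main__":
    # Hypothetical asymmetric LLM: 90% accurate under A, 60% accurate under B.
    print(information_rates(0.9, 0.6))
    print(run_single_llm_test("A", acc_a=0.9, acc_b=0.6, cost=1.0, alpha=0.01))
```

With unequal accuracies the two rates differ, so the same LLM accumulates evidence at different speeds depending on which hypothesis is true. That is the asymmetry the abstract refers to, and it suggests why a policy restricted to one LLM may be suboptimal.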
