Adaptive Consensus in LLM Ensembles via Sequential Evidence Accumulation: Automatic Budget Identification and Calibrated Commit Signals

Roberto E. Medina

arXiv:2605.04236·cs.LG·May 15, 2026

TL;DR

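This paper introduces DASE (Deliberative Adaptive Stopping Ensemble), a stopping heuristic for iterative ensemble deliberation in large language models. It commits early when the ensemble reaches genuine consensus and falls back to a global-frequency vote when the evidence is fragmented, and its commit-type routing partition generalises across benchmarks.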

Contribution

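DASE is an adaptive stopping heuristic for iterative ensemble deliberation that commits early on consensus to improve ensemble reasoning accuracy. Its commit-type routing partition generalises across benchmarks and is complementary to verbalized single-call confidence.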

Findings

01

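DASE achieves large routing gaps across benchmarks (39.5 pp on GPQA-Extended, N=546, 70B ensemble; 25.5 pp on AIME 2010-2023, N=261, 120B ensemble, 3 seeds), along with accuracy improvements.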

02

Adaptive stopping, not bandwidth, primarily drives ensemble accuracy.

03

Injection-based methods show an inverted-U accuracy trajectory, suggesting new hypotheses.

Abstract

Large Language Model ensembles improve reasoning accuracy, but only up to a performance boundary beyond which additional deliberation degrades accuracy. We introduce DASE (Deliberative Adaptive Stopping Ensemble), a stopping heuristic for iterative ensemble deliberation that commits early on genuine consensus and applies a global-frequency fallback on fragmented evidence. We make three contributions. (1) DASE produces a commit-type routing partition that generalises across benchmarks and is complementary to verbalized single-call confidence. On GPQA-Extended (N=546, 70B ensemble), the partition yields a 39.5 pp routing gap (right-wall 81.1% vs. left-wall 41.5%). On AIME 2010-2023 (N=261, 120B ensemble, 3 seeds), right-wall commits reach 98.3% accuracy vs. left-wall 72.8% (25.5 pp gap), statistically equivalent to Opus 4.6 Standard verbalized confidence at matched coverage (25.7 pp gap;…
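The two commit paths described in the abstract (early commit on consensus, global-frequency fallback on fragmented evidence) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the function name `deliberate`, the parameters `agree_frac`, `patience`, and `max_rounds`, and the specific consensus rule (the same answer holding a qualifying majority for several consecutive rounds) are all assumptions made for illustration. The labels "consensus" and "fallback" are likewise hypothetical; the excerpt does not say how they map onto the paper's right-wall and left-wall commit types.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def deliberate(
    rounds: Iterable[List[str]],
    agree_frac: float = 0.75,  # assumed: share of members that must agree in a round
    patience: int = 2,         # assumed: consecutive agreeing rounds needed to commit
    max_rounds: int = 5,       # assumed: deliberation budget
) -> Tuple[str, str, int]:
    """Return (answer, commit_type, rounds_used) for one question.

    Each element of `rounds` is the list of answers given by the ensemble
    members in that deliberation round.
    """
    totals: Counter = Counter()  # answer frequencies accumulated over all rounds
    leader, streak, used = None, 0, 0

    for votes in rounds:
        if used == max_rounds:
            break
        if not votes:
            raise ValueError("each round needs at least one answer")
        used += 1
        totals.update(votes)

        top, count = Counter(votes).most_common(1)[0]
        if count / len(votes) >= agree_frac:
            # Evidence for `top` accumulates only while the same answer leads.
            streak = streak + 1 if top == leader else 1
            leader = top
        else:
            # Fragmented round: discard the accumulated streak.
            leader, streak = None, 0

        if streak >= patience:
            return top, "consensus", used  # early commit

    if not totals:
        raise ValueError("no rounds were provided")
    # Budget exhausted without consensus: fall back to global frequency.
    return totals.most_common(1)[0][0], "fallback", used


if __name__ == "__main__":
    agreeing = [["A", "A", "A", "B"], ["A", "A", "A", "A"]]
    print(deliberate(agreeing))
    fragmented = [["A", "B", "C", "B"], ["C", "B", "A", "D"], ["B", "A", "C", "B"]]
    print(deliberate(fragmented, max_rounds=3))
```

In this sketch the commit type is returned alongside the answer, so downstream code could route "consensus" and "fallback" items differently, which is the general idea behind a commit-type routing partition.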
