CITE: Anytime-Valid Statistical Inference in LLM Self-Consistency

Hirofumi Ota; Naoto Iwase; Yuki Ichihara; Junpei Komiyama; Masaaki Imaizumi

arXiv:2605.05873·stat.ML·May 8, 2026

TL;DR

This paper introduces CITE, an anytime-valid method for certifying that a prespecified target answer is the unique mode of an LLM's response distribution. It controls false certification under arbitrary data-driven stopping rules and needs no prior knowledge of the answer category set.

Contribution

The paper presents the CITE algorithm, which offers provable control of false certification, a category-set-size-free stopping-time rate, and extensions to confidence-weighted voting.

Findings

01

CITE controls false certification at any prescribed level.

02

The method achieves a category-set-size-free stopping-time rate.

03

Experiments show improved certification accuracy in diffuse-tail settings.

Abstract

Large language models often improve reasoning by sampling multiple outputs and aggregating their final answers, but precise and efficient control of error levels remains a challenging task. In particular, deciding when to stop sampling remains difficult when the stopping rule is data-dependent and the set of possible answers is not known in advance. We study anytime-valid certification of a prespecified target answer as the unique mode of the model's response distribution, a guarantee distinct from answer correctness. We propose the Certification by Intersection-union Testing with E-processes (CITE) algorithm, which provably controls false certification at any prescribed level under arbitrary data-driven stopping, without requiring prior knowledge of the answer category set. We also prove a category-set-size-free stopping-time rate, establish matching minimax lower bounds up to…
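To make the idea of anytime-valid mode certification concrete, here is a minimal Python sketch. It is not the paper's CITE construction, whose details are not given on this page. It is a simplified illustration that assumes i.i.d. samples and a fixed betting fraction `lam`. The function name `certify_mode` and its parameters are hypothetical. For each rival answer j, the wealth (1 + lam)^N_target × (1 − lam)^N_j is a nonnegative supermartingale whenever the target is no more likely than j, so by Ville's inequality it rarely reaches 1/alpha. Certifying only when every rival's wealth crosses 1/alpha is an intersection-union test, and because the minimum over rivals depends only on the largest rival count, the answer set never has to be listed in advance.

```python
import math
from collections import Counter


def certify_mode(samples, target, alpha=0.05, lam=0.5):
    """Illustrative sketch only (not the paper's CITE algorithm).

    Returns the 1-based sample index at which `target` is certified as the
    unique mode at level `alpha`, or None if the stream ends first.
    Assumes i.i.d. samples and a fixed betting fraction `lam` in (0, 1).
    """
    if not (0.0 < alpha < 1.0 and 0.0 < lam < 1.0):
        raise ValueError("alpha and lam must lie strictly between 0 and 1")

    log_threshold = math.log(1.0 / alpha)
    counts = Counter()

    for t, answer in enumerate(samples, start=1):
        counts[answer] += 1
        n_target = counts[target]
        # Largest count among rival answers; 0 covers rivals not yet seen.
        n_rival = max((c for a, c in counts.items() if a != target), default=0)
        # Log of the smallest pairwise wealth across all rivals.
        log_wealth = n_target * math.log1p(lam) + n_rival * math.log1p(-lam)
        if log_wealth >= log_threshold:
            return t
    return None
```

For example, `certify_mode(["42"] * 20, "42")` stops as soon as enough consecutive target answers have accumulated, while a stream that alternates between two answers never certifies either one. A fixed `lam` keeps the sketch short; an adaptive bet would typically stop sooner, at the cost of more bookkeeping.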
