CITE: Anytime-Valid Statistical Inference in LLM Self-Consistency
Hirofumi Ota, Naoto Iwase, Yuki Ichihara, Junpei Komiyama, Masaaki Imaizumi

TL;DR
This paper introduces CITE, a statistical inference method for LLM self-consistency that controls false certification under arbitrary data-driven stopping rules and without prior knowledge of the answer category set.
Contribution
The paper presents the CITE algorithm, which provides provable control of false certification, a category-set-size-free stopping-time rate, and extensions to confidence-weighted voting.
Findings
CITE controls false certification at any prescribed level.
The method achieves a category-set-size-free stopping-time rate.
Experiments show improved certification accuracy in diffuse-tail settings.
Abstract
Large language models often improve reasoning by sampling multiple outputs and aggregating their final answers, but precise and efficient control of error levels remains a challenging task. In particular, deciding when to stop sampling remains difficult when the stopping rule is data-dependent and the set of possible answers is not known in advance. We study anytime-valid certification of a prespecified target answer as the unique mode of the model's response distribution, a guarantee distinct from answer correctness. We propose the Certification by Intersection-union Testing with E-processes (CITE) algorithm, which provably controls false certification at any prescribed level under arbitrary data-driven stopping, without requiring prior knowledge of the answer category set. We also prove a category-set-size-free stopping-time rate, establish matching minimax lower bounds up to…
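The idea in the abstract, certifying a target answer as the unique mode while stopping at a data-dependent time, can be illustrated with a deliberately simplified sketch. This is not the paper's CITE algorithm. It assumes a fixed-fraction betting e-process for each pairwise comparison, and the names `certify_mode`, `alpha`, and `bet` are hypothetical. For each rival answer, the null hypothesis is that the rival is at least as likely as the target. The wealth against that rival is multiplied by (1 + bet) when the target is sampled, by (1 - bet) when the rival is sampled, and is unchanged otherwise, so it is a nonnegative supermartingale under that null. In the intersection-union step, the target is certified only when the wealth against every rival reaches 1/alpha. A rival that has not been observed yet has wealth (1 + bet) raised to the target count, so the check needs no list of categories in advance.

```python
import math
from collections import Counter


def certify_mode(samples, target, alpha=0.05, bet=0.5):
    """Illustrative sketch only (not the paper's CITE algorithm).

    Returns the 1-based index of the sample at which `target` is certified
    as the unique mode at level `alpha`, or None if that never happens.
    `bet` is an assumed fixed betting fraction in (0, 1).
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if not 0.0 < bet < 1.0:
        raise ValueError("bet must lie strictly between 0 and 1")

    threshold = math.log(1.0 / alpha)  # certify when log-wealth >= log(1/alpha)
    up = math.log1p(bet)               # log-wealth gain when the target is sampled
    down = math.log1p(-bet)            # log-wealth loss when the rival is sampled
    counts = Counter()

    for t, answer in enumerate(samples, start=1):
        counts[answer] += 1
        n_target = counts[target]

        # Log-wealth against any rival not observed so far.
        min_log_wealth = n_target * up

        # Intersection-union step: take the weakest evidence over observed rivals.
        for rival, n_rival in counts.items():
            if rival == target:
                continue
            log_wealth = n_target * up + n_rival * down
            min_log_wealth = min(min_log_wealth, log_wealth)

        if min_log_wealth >= threshold:
            return t
    return None
```

For example, `certify_mode(["42"] * 10, "42")` stops at the eighth sample, since 1.5 to the power 8 is the first value above 1/0.05 = 20. A fixed `bet` keeps the sketch short; an adaptive betting fraction would typically stop sooner, and the choice here is an assumption rather than something taken from the paper.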
