Learning When to Sample: Confidence-Aware Self-Consistency for Efficient LLM Chain-of-Thought Reasoning

Juming Xiong; Kevin Guo; Congning Ni; Chao Yan; Katherine Brown; Avinash Baidya; Xiang Gao; Bradley Malin; Zhijun Yin

arXiv:2603.08999·cs.CL·March 19, 2026

Open Access

TL;DR

This paper presents a confidence-aware framework that adaptively chooses between single-path and multi-path reasoning in LLMs, using up to 80% fewer tokens while keeping accuracy comparable to multi-path methods.

Contribution

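The paper introduces a decision framework that analyzes a single completed reasoning trajectory to decide whether multi-path sampling is needed, balancing accuracy against computational cost. Trained on MedQA, it transfers to other datasets without additional fine-tuning.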

Findings

01

Achieves comparable accuracy to multi-path methods with up to 80% fewer tokens.

02

Trained on MedQA, generalizes to MathQA, MedMCQA, and MMLU without additional fine-tuning.

03

Uses sentence-level numeric and linguistic features from intermediate reasoning states to estimate uncertainty.

Abstract

Large language models (LLMs) achieve strong reasoning performance through chain-of-thought (CoT) reasoning, yet often generate unnecessarily long reasoning paths that incur high inference cost. Recent self-consistency-based approaches further improve accuracy but require sampling and aggregating multiple reasoning trajectories, leading to substantial additional computational overhead. This paper introduces a confidence-aware decision framework that analyzes a single completed reasoning trajectory to adaptively select between single-path and multi-path reasoning. The framework is trained using sentence-level numeric and linguistic features extracted from intermediate reasoning states in the MedQA dataset and generalizes effectively to MathQA, MedMCQA, and MMLU without additional fine-tuning. Experimental results show that the proposed method maintains accuracy comparable to multi-path…
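The decision described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `sample_path`, `confidence`, `threshold`, and `n_paths` are hypothetical names, the confidence scorer is a stand-in for the paper's trained model over sentence-level features, and majority voting is assumed as the aggregation step for the multi-path fallback.

```python
from collections import Counter
from typing import Callable, List, Tuple

# A reasoning path is (final_answer, list_of_reasoning_sentences).
Path = Tuple[str, List[str]]


def majority_vote(answers: List[str]) -> str:
    """Return the most frequent answer (self-consistency aggregation)."""
    return Counter(answers).most_common(1)[0][0]


def answer_adaptively(
    question: str,
    sample_path: Callable[[str], Path],        # hypothetical: one CoT sample from an LLM
    confidence: Callable[[List[str]], float],  # hypothetical: scorer over the trajectory
    threshold: float = 0.8,                    # assumed cutoff, not taken from the paper
    n_paths: int = 5,                          # assumed sample budget for the fallback
) -> Tuple[str, int]:
    """Return (answer, number_of_paths_sampled)."""
    first_answer, first_trace = sample_path(question)

    # Confident enough in the single completed trajectory: stop here.
    if confidence(first_trace) >= threshold:
        return first_answer, 1

    # Otherwise sample additional paths and aggregate by majority vote,
    # reusing the first path so its tokens are not wasted.
    answers = [first_answer]
    for _ in range(n_paths - 1):
        answers.append(sample_path(question)[0])
    return majority_vote(answers), n_paths
```

In this sketch the token savings come entirely from the early return: every question whose first trajectory clears the threshold costs one path instead of `n_paths`.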

Taxonomy

Topics: Topic Modeling · Machine Learning in Healthcare · Advanced Graph Neural Networks