Confidence Before Answering: A Paradigm Shift for Efficient LLM Uncertainty Estimation

Changcheng Li; Jiancan Wu; Hengheng Zhang; Zhengsu Chen; Guo An; Junxiang Qiu; Xiang Wang; Qi Tian

arXiv:2603.05881 · cs.CL · March 9, 2026

TL;DR

This paper introduces a confidence-first approach for large language models, enabling them to estimate their answer correctness before generating responses, which improves calibration and practical usability.

Contribution

The paper proposes CoCA, a novel reinforcement learning framework that jointly optimizes confidence calibration and answer accuracy in LLMs, addressing limitations of answer-first methods.

Findings

01 Improved confidence calibration across benchmarks

02 Enhanced uncertainty discrimination capabilities

03 Maintained high answer quality

Abstract

Reliable deployment of large language models (LLMs) requires accurate uncertainty estimation. Existing methods are predominantly answer-first: they produce confidence only after generating an answer, so the score measures the correctness of one specific response, which limits practical usability. We study a confidence-first paradigm, where the model outputs its confidence before answering, and we interpret this score as the model's probability of answering the question correctly under its current policy. We propose CoCA (Co-optimized Confidence and Answers), a GRPO-based reinforcement learning framework that jointly optimizes confidence calibration and answer accuracy via segmented credit assignment. By assigning separate rewards and group-relative advantages to the confidence and answer segments, CoCA enables stable joint optimization and avoids reward hacking. Experiments across math, code, and factual QA…
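To make the idea of segmented credit assignment concrete, here is a minimal sketch in Python. It is not the authors' implementation. The abstract does not give the reward definitions, so the sketch assumes a Brier-style confidence reward measured against the group's empirical success rate and a plain correctness reward for the answer. The function names `group_relative`, `segmented_advantages`, and `per_token_advantages` are hypothetical.

```python
from statistics import mean, pstdev


def group_relative(rewards, eps=1e-6):
    """Standardize rewards within one sampled group (GRPO-style advantage)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def segmented_advantages(confidences, correct):
    """Return (confidence_advantages, answer_advantages) for one question.

    confidences: stated confidence in [0, 1] for each sampled response
    correct:     1 if that response's answer was right, else 0
    """
    # ASSUMPTION: the calibration target is the group's empirical success
    # rate, a proxy for "probability of answering correctly under the
    # current policy".
    target = mean(correct)
    # ASSUMPTION: Brier-style reward; a confidence closer to the target is better.
    conf_rewards = [-(c - target) ** 2 for c in confidences]
    # ASSUMPTION: the answer reward is plain correctness.
    ans_rewards = [float(y) for y in correct]
    # Each segment gets its own group-relative advantage.
    return group_relative(conf_rewards), group_relative(ans_rewards)


def per_token_advantages(n_conf_tokens, n_ans_tokens, conf_adv, ans_adv):
    """Broadcast each segment's advantage over the tokens of that segment."""
    return [conf_adv] * n_conf_tokens + [ans_adv] * n_ans_tokens


# Four sampled responses to the same question.
confidences = [0.9, 0.5, 0.2, 0.6]
correct = [1, 0, 0, 1]
conf_adv, ans_adv = segmented_advantages(confidences, correct)
token_adv = per_token_advantages(2, 3, conf_adv[1], ans_adv[1])
```

In this toy group, the second response answers incorrectly but states a confidence of 0.5, which matches the group's success rate. Its confidence tokens therefore receive a positive advantage while its answer tokens receive a negative one. A single sequence-level reward could not express that split.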


Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Expert finding and Q&A systems