ORCE: Order-Aware Alignment of Verbalized Confidence in Large Language Models

Chen Li; Xiaoling Hu; Songzhu Zheng; Jiawei Zhou; Chao Chen

arXiv:2605.12446 · cs.LG · May 13, 2026

TL;DR

This paper introduces a decoupled, order-aware framework for verbalized confidence calibration in large language models, improving confidence alignment without sacrificing answer accuracy.

Contribution

The paper proposes a method that separates answer generation from confidence estimation and uses rank-based reinforcement learning to improve calibration.
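To make the rank-based idea concrete, here is a minimal sketch of an order-aware score: it rewards a set of confidences when correct answers are ranked above incorrect ones, regardless of the absolute values. The function name `pairwise_order_score` and the tie handling are assumptions made for illustration; the summary here does not give the paper's actual reward, so this should not be read as the authors' objective.

```python
from itertools import product


def pairwise_order_score(confidences, correct):
    """Fraction of (correct, incorrect) answer pairs ranked in the right order.

    Illustrative only: a generic pairwise ranking score, not the paper's reward.
    Ties count as half a win. Returns None if no such pair exists.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")

    pos = [c for c, ok in zip(confidences, correct) if ok]      # correct answers
    neg = [c for c, ok in zip(confidences, correct) if not ok]  # incorrect answers
    if not pos or not neg:
        return None  # ordering is undefined without both kinds of answers

    wins = 0.0
    for p, n in product(pos, neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos) * len(neg))
```

A score of 1.0 means every correct answer received higher confidence than every incorrect one; 0.5 is what uninformative confidences would give on average.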

Findings

1. Improves confidence calibration and failure prediction performance.

2. Largely preserves answer accuracy while enhancing confidence reliability.

3. Demonstrates effectiveness on reasoning and knowledge-intensive benchmarks.
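Calibration claims like the ones above are commonly quantified with expected calibration error (ECE). The sketch below is a standard equal-width-bin ECE, included only to show what "calibration" measures; whether the paper reports this exact metric, and with how many bins, is not stated in this summary, so `n_bins=10` is an assumption.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: weighted mean |avg confidence - accuracy| per bin."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("need at least one prediction")
    if any(c < 0.0 or c > 1.0 for c in confidences):
        raise ValueError("confidences must lie in [0, 1]")

    # Assign each prediction to a bin; confidence 1.0 goes into the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, 1.0 if ok else 0.0))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(a for _, a in members) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece
```

Lower is better: a model whose stated confidence matches its empirical accuracy in every bin scores 0.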

Abstract

Large language models (LLMs) often produce answers with high certainty even when they are incorrect, making reliable confidence estimation essential for deployment in real-world scenarios. Verbalized confidence, where models explicitly state their confidence in natural language, provides a flexible and user-facing uncertainty signal that can be applied even when token logits are unavailable. However, existing verbalized-confidence methods often optimize answer generation and confidence generation jointly, which can cause confidence-alignment objectives to interfere with answer accuracy. In this work, we propose a decoupled and order-aware framework for verbalized confidence calibration. Our method first generates an answer and then estimates confidence conditioned on the fixed question-answer pair, allowing confidence optimization without directly perturbing the answer-generation…
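The two-stage flow described in the abstract can be sketched as follows: the answer is produced first and then held fixed while a second call verbalizes confidence for that question-answer pair. Everything named here (`answer_then_judge`, `parse_confidence`, the two callables, and the percentage format) is hypothetical scaffolding rather than the paper's interface, and the abstract is truncated before it gives further detail.

```python
import re
from typing import Callable, NamedTuple, Optional


class Judged(NamedTuple):
    question: str
    answer: str
    confidence: Optional[float]  # in [0, 1], or None if nothing could be parsed


# Assumed output format: the confidence statement contains a percentage like "85%".
_PERCENT = re.compile(r"(\d+(?:\.\d+)?)\s*%")


def parse_confidence(statement: str) -> Optional[float]:
    """Extract the first percentage from a verbalized confidence statement."""
    match = _PERCENT.search(statement)
    if match is None:
        return None
    value = float(match.group(1)) / 100.0
    return min(max(value, 0.0), 1.0)  # clamp values such as "150%"


def answer_then_judge(
    question: str,
    generate_answer: Callable[[str], str],            # hypothetical stage-1 model call
    verbalize_confidence: Callable[[str, str], str],  # hypothetical stage-2 model call
) -> Judged:
    """Stage 1 fixes the answer; stage 2 sees only the (question, answer) pair."""
    answer = generate_answer(question)
    statement = verbalize_confidence(question, answer)
    return Judged(question, answer, parse_confidence(statement))
```

Because the second stage never regenerates the answer, any tuning applied to the confidence call alone leaves the stage-1 output for a given question unchanged, which is the property the abstract highlights.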
