Reasoning Models Better Express Their Confidence

Dongkeun Yoon; Seungone Kim; Sohee Yang; Sunkyoung Kim; Soyeon Kim; Yongil Kim; Eunbi Choi; Yireun Kim; Minjoon Seo

arXiv:2505.14489·cs.AI·October 23, 2025

TL;DR

This paper shows that reasoning models, which produce an extended chain of thought before answering, express their confidence more accurately than non-reasoning models, and that their calibration improves as the reasoning unfolds.

Contribution

It demonstrates that slow, iterative reasoning behaviors improve confidence calibration in large language models, a novel insight into model self-assessment.

Findings

01

Reasoning models achieve strictly better confidence calibration than their non-reasoning counterparts in 33 of 36 settings (six models across six datasets).

02

Slow thinking behaviors, such as exploring alternative approaches and backtracking, let models adjust their confidence dynamically throughout the chain of thought.

03

Guiding non-reasoning models to slow down also enhances their calibration, isolating slow thinking as the key factor.

Abstract

Despite their strengths, large language models (LLMs) often fail to communicate their confidence accurately, making it difficult to assess when they might be wrong and limiting their reliability. In this work, we demonstrate that reasoning models that engage in extended chain-of-thought (CoT) reasoning exhibit superior performance not only in problem-solving but also in accurately expressing their confidence. Specifically, we benchmark six reasoning models across six datasets and find that they achieve strictly better confidence calibration than their non-reasoning counterparts in 33 out of the 36 settings. Our detailed analysis reveals that these gains in calibration stem from the slow thinking behaviors of reasoning models (e.g., exploring alternative approaches and backtracking) which enable them to adjust their confidence dynamically throughout their CoT, making it progressively…
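To make "confidence calibration" concrete, the sketch below computes expected calibration error (ECE), one widely used calibration measure. It is an illustration only: this excerpt does not say which metrics the paper reports, and the function name `expected_calibration_error`, the equal-width binning, and the default of 10 bins are assumptions made for this example. A lower value means the stated confidence tracks the actual accuracy more closely.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE (illustrative; not taken from the paper).

    confidences: floats in [0, 1], the model's stated confidence per answer.
    correct:     booleans, whether each answer was actually right.
    Returns the sample-weighted mean of |accuracy - mean confidence| over bins.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    # Group predictions into equal-width confidence bins.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    ece = 0.0
    for items in bins:
        if not items:
            continue
        avg_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        ece += (len(items) / n) * abs(accuracy - avg_conf)
    return ece


# A model that says "90% sure" four times and is right three times:
# accuracy is 0.75 against a stated confidence of 0.9.
overconfident = expected_calibration_error(
    [0.9, 0.9, 0.9, 0.9], [True, True, True, False]
)
```

In this toy case the gap between stated confidence (0.9) and accuracy (0.75) gives an ECE of about 0.15; a model that stated 0.75 for the same answers would score 0.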

Code & Models

Repositories

mattyoon/reasoning-models-confidence (Official)

Videos

Reasoning Models Better Express Their Confidence (SlidesLive)

Taxonomy

Topics: Semantic Web and Ontologies