CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought

Boxuan Zhang; Ruqi Zhang

arXiv:2502.17214 · cs.CL · June 4, 2025

TL;DR

CoT-UQ is a response-wise uncertainty quantification framework for LLMs that uses Chain-of-Thought reasoning to improve uncertainty estimates, achieving 5.9% higher AUROC on average than existing methods.

Contribution

This work presents a response-wise uncertainty quantification (UQ) method that integrates Chain-of-Thought reasoning into uncertainty estimation for large language models.

Findings

01

CoT-UQ achieves 5.9% higher AUROC on average compared to existing UQ methods.

02

It captures critical reasoning information by extracting keywords from each reasoning step and assessing their importance to the final answer.

03

The method is validated on Llama models across logical and mathematical tasks.
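
For readers unfamiliar with the metric in the first finding: in this setting, AUROC measures how well a confidence score separates correct responses from incorrect ones (1.0 is perfect separation, 0.5 is chance). The following is a minimal Python sketch of that computation, not the authors' evaluation code; the function name `auroc` and the toy data are illustrative assumptions.

```python
from typing import Sequence


def auroc(confidences: Sequence[float], correct: Sequence[bool]) -> float:
    """Probability that a randomly chosen correct response receives a higher
    confidence than a randomly chosen incorrect one (ties count as half)."""
    pos = [c for c, ok in zip(confidences, correct) if ok]
    neg = [c for c, ok in zip(confidences, correct) if not ok]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect response")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration): four responses with confidence scores
# and whether each response was judged correct.
toy_confidences = [0.9, 0.8, 0.4, 0.3]
toy_correct = [True, False, True, False]
toy_auroc = auroc(toy_confidences, toy_correct)
print(toy_auroc)  # 3 of the 4 (correct, incorrect) pairs are ordered correctly
```

The pairwise loop is quadratic in the number of responses, which is fine for a sketch; a library routine would be the usual choice at scale.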

Abstract

Large language models (LLMs) excel in many tasks but struggle to accurately quantify uncertainty in their generated responses. This limitation makes it challenging to detect misinformation and ensure reliable decision-making. Existing uncertainty quantification (UQ) methods for LLMs are primarily prompt-wise rather than response-wise, often requiring multiple response samples, which incurs high computational costs. Moreover, LLMs have been shown to be overconfident, particularly when using reasoning steps to derive their answers. In this work, we propose CoT-UQ, a response-wise UQ framework that integrates LLMs' inherent reasoning capabilities through Chain-of-Thought (CoT) into the UQ process. CoT-UQ captures critical information during inference by extracting keywords from each reasoning step and assessing their importance to the final answer. This key reasoning information is then…
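
The abstract is cut off before it says how the extracted information is used, so the sketch below does not reproduce the paper's method. It only illustrates, in Python, one plausible way to combine per-keyword signals: an importance-weighted average of keyword probabilities across reasoning steps. The `Keyword` class, its `prob` and `importance` fields, the weighting formula, and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Keyword:
    text: str
    prob: float        # assumed: probability the model assigned to this keyword
    importance: float  # assumed: importance of the keyword to the final answer


def weighted_confidence(steps: List[List[Keyword]]) -> float:
    """Importance-weighted mean of keyword probabilities over all steps.

    This aggregation rule is a hypothetical stand-in, not taken from the paper.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for step in steps:
        for kw in step:
            weighted_sum += kw.importance * kw.prob
            total_weight += kw.importance
    if total_weight == 0:
        raise ValueError("no keywords with positive importance")
    return weighted_sum / total_weight


# Made-up example: two reasoning steps with hand-assigned values.
example_steps = [
    [Keyword("12 apples", prob=0.9, importance=8),
     Keyword("gives away", prob=0.8, importance=2)],
    [Keyword("12 - 5 = 7", prob=0.6, importance=10)],
]
example_score = weighted_confidence(example_steps)
print(round(example_score, 2))
```

Weighting by importance means a low-probability keyword that the answer depends on heavily pulls the score down more than an uncertain but incidental one.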

Code & Models

Repositories

zbox1005/cot-uq
PyTorch · Official

Taxonomy

Topics: Scientific Computing and Data Management

Methods: LLaMA