Joint Evaluation of Answer and Reasoning Consistency for Hallucination Detection in Large Reasoning Models
Changyue Wang, Weihang Su, Qingyao Ai, Yiqun Liu

TL;DR
This paper introduces RACE, a framework for detecting hallucinations in large reasoning models by evaluating the consistency between answers and reasoning traces, improving detection accuracy over existing methods.
Contribution
The paper proposes RACE, a novel approach that jointly assesses reasoning trace consistency and answer uncertainty to effectively detect hallucinations in large reasoning models.
Findings
RACE outperforms existing hallucination detection methods.
Joint evaluation of reasoning and answer improves detection robustness.
RACE is applicable across different datasets and large language models.
Abstract
Large Reasoning Models (LRMs) extend large language models with explicit, multi-step reasoning traces to enhance transparency and performance on complex tasks. However, these reasoning traces can be redundant or logically inconsistent, becoming a new and hard-to-detect source of hallucination. Existing hallucination detection methods focus primarily on answer-level uncertainty and often fail to detect hallucinations or logical inconsistencies arising from the model's reasoning trace. This oversight is particularly problematic for LRMs, where the explicit thinking trace is not only an important support to the model's decision-making process but also a key source of potential hallucination. To this end, we propose RACE (Reasoning and Answer Consistency Evaluation), a novel framework specifically tailored for hallucination detection in LRMs. RACE operates by extracting essential reasoning…
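The idea of scoring answers and reasoning traces together can be illustrated with a minimal sketch. This is not the RACE scoring procedure, which the truncated abstract does not fully describe. It assumes several sampled generations per question, each with a hypothetical `answer` and `reasoning` field, and swaps in two simple stand-ins: normalized entropy over the sampled answers, and one minus the mean pairwise Jaccard similarity of the trace tokens. The function names and the `weight` parameter are illustrative assumptions.

```python
import math
from collections import Counter
from itertools import combinations


def answer_entropy(answers):
    """Normalized Shannon entropy of sampled answers (0.0 means all agree)."""
    counts = Counter(a.strip().lower() for a in answers)
    total = sum(counts.values())
    if len(counts) <= 1:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    # Maximum entropy occurs when every sample gives a different answer.
    return entropy / math.log(total)


def trace_inconsistency(traces):
    """One minus the mean pairwise Jaccard similarity of trace token sets."""
    token_sets = [set(t.lower().split()) for t in traces]
    pairs = list(combinations(token_sets, 2))
    if not pairs:
        return 0.0
    sims = [len(a & b) / len(a | b) if (a | b) else 1.0 for a, b in pairs]
    return 1.0 - sum(sims) / len(sims)


def hallucination_score(samples, weight=0.5):
    """Weighted mix of both signals (assumed combination, not the paper's)."""
    answers = [s["answer"] for s in samples]
    traces = [s["reasoning"] for s in samples]
    return (weight * answer_entropy(answers)
            + (1.0 - weight) * trace_inconsistency(traces))


if __name__ == "__main__":
    samples = [
        {"answer": "Paris", "reasoning": "The capital of France is Paris"},
        {"answer": "Paris", "reasoning": "France has its capital in Paris"},
        {"answer": "Lyon", "reasoning": "Lyon is the largest city so it is the capital"},
    ]
    print(round(hallucination_score(samples), 3))
```

Higher scores indicate more disagreement across samples. A score built only from `answer_entropy` would miss the case where the answers agree but the traces diverge, which is the gap the abstract attributes to answer-level methods.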
Taxonomy
Topics: Topic Modeling · Advanced Graph Neural Networks · Machine Learning in Healthcare
