Are DeepSeek R1 And Other Reasoning Models More Faithful?

James Chua; Owain Evans

arXiv:2501.08156·cs.LG·July 16, 2025

TL;DR

This paper evaluates whether the Chains of Thought of reasoning models such as DeepSeek-R1 are more faithful than those of traditional models. It finds that reasoning models more often describe how a cue in the prompt influences their answer.

Contribution

The study applies an existing test of faithful CoT to three reasoning models (based on Qwen-2.5, Gemini-2, and DeepSeek-V3-Base) and shows that they describe cue influence more reliably than non-reasoning models.

Findings

01

When a cue causes the model to switch its answer, the DeepSeek-R1 reasoning model describes the cue's influence 59% of the time, versus 7% for the non-reasoning DeepSeek model.

02

The evaluation covers seven types of cue, such as misleading few-shot examples and suggestive follow-up questions; reasoning models outperform non-reasoning models on these faithfulness tests.

03

Reward models may reduce faithfulness, affecting model explainability.

Abstract

Language models trained to solve reasoning tasks via reinforcement learning have achieved striking results. We refer to these models as reasoning models. Are the Chains of Thought (CoTs) of reasoning models more faithful than traditional models? We evaluate three reasoning models (based on Qwen-2.5, Gemini-2, and DeepSeek-V3-Base) on an existing test of faithful CoT. To measure faithfulness, we test whether models can describe how a cue in their prompt influences their answer to MMLU questions. For example, when the cue "A Stanford Professor thinks the answer is D" is added to the prompt, models sometimes switch their answer to D. In such cases, the DeepSeek-R1 reasoning model describes the cue's influence 59% of the time, compared to 7% for the non-reasoning DeepSeek model. We evaluate seven types of cue, such as misleading few-shot examples and suggestive follow-up questions from the…
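
The measurement described in the abstract can be sketched in code. This is a minimal illustration, not the authors' implementation: the `Trial` record, its field names, and the `mentions_cue` label (which in practice would have to come from some judgment of the CoT text) are assumptions introduced here. The sketch keeps only the trials where the cue switched the answer, then reports the fraction of those in which the model describes the cue's influence.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Trial:
    """One MMLU question asked with and without a cue (hypothetical record)."""
    answer_without_cue: str  # model's answer to the plain prompt, e.g. "B"
    answer_with_cue: str     # model's answer once the cue is added
    cued_answer: str         # the option the cue points at, e.g. "D"
    mentions_cue: bool       # assumed label: does the CoT describe the cue's influence?


def switched_to_cue(trial: Trial) -> bool:
    """True when the model moved from a different answer to the cued one."""
    return (
        trial.answer_without_cue != trial.cued_answer
        and trial.answer_with_cue == trial.cued_answer
    )


def articulation_rate(trials: Sequence[Trial]) -> Optional[float]:
    """Fraction of switched trials whose CoT describes the cue.

    Returns None when no trial switched, since the rate is undefined there.
    """
    switched = [t for t in trials if switched_to_cue(t)]
    if not switched:
        return None
    return sum(t.mentions_cue for t in switched) / len(switched)


if __name__ == "__main__":
    trials = [
        Trial("B", "D", "D", True),   # switched, cue described
        Trial("A", "D", "D", False),  # switched, cue not described
        Trial("D", "D", "D", False),  # already answered D, so not a switch
        Trial("C", "C", "D", False),  # cue had no effect
    ]
    print(articulation_rate(trials))  # prints 0.5
```

Conditioning on switched answers is the key design choice: trials where the cue had no visible effect say nothing about whether the model can describe its influence, so they are excluded from the denominator.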

