Stop Rewarding Hallucinated Steps: Faithfulness-Aware Step-Level Reinforcement Learning for Small Reasoning Models

Shuo Nie; Hexuan Deng; Chao Wang; Ruiyu Fang; Xuebo Liu; Shuangyong Song; Yu Li; Min Zhang; Xuelong Li

arXiv:2602.05897·cs.CL·February 6, 2026

TL;DR

FaithRL is a step-level reinforcement learning method for small reasoning models. It combines explicit faithfulness rewards from a process reward model with contrastive signals from truncated resampling, so that unfaithful intermediate steps are not reinforced when the final answer happens to be correct.

Contribution

The paper proposes FaithRL, a faithfulness-aware reinforcement learning method that supervises small reasoning models at the step level, explicitly rewarding faithful intermediate steps rather than only correct final answers.

Findings

01. FaithRL significantly reduces hallucinations in intermediate reasoning steps.

02. It improves the faithfulness and reliability of small reasoning models.

03. It is effective across multiple SRMs and Open-Book QA benchmarks.

Abstract

As large language models become smaller and more efficient, small reasoning models (SRMs) are crucial for enabling chain-of-thought (CoT) reasoning in resource-constrained settings. However, they are prone to faithfulness hallucinations, especially in intermediate reasoning steps. Existing mitigation methods based on online reinforcement learning rely on outcome-based rewards or coarse-grained CoT evaluation, which can inadvertently reinforce unfaithful reasoning when the final answer is correct. To address these limitations, we propose Faithfulness-Aware Step-Level Reinforcement Learning (FaithRL), introducing step-level supervision via explicit faithfulness rewards from a process reward model, together with an implicit truncated resampling strategy that generates contrastive signals from faithful prefixes. Experiments across multiple SRMs and Open-Book QA benchmarks demonstrate that…
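The two ingredients named in the abstract, step-level faithfulness rewards and truncation at a faithful prefix, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the 0.5 threshold, the -1.0 penalty, and the rule for combining step scores with the outcome reward are all assumptions chosen for illustration, and the per-step scores stand in for the output of a process reward model.

```python
from typing import List, Sequence


def faithful_prefix_length(step_scores: Sequence[float],
                           threshold: float = 0.5) -> int:
    """Count leading steps whose faithfulness score meets the threshold.

    Hypothetical helper: a resampling strategy could truncate the chain of
    thought at this point and sample new continuations from the prefix.
    """
    n = 0
    for score in step_scores:
        if score < threshold:
            break
        n += 1
    return n


def step_rewards(step_scores: Sequence[float],
                 outcome_correct: bool,
                 threshold: float = 0.5,
                 penalty: float = -1.0) -> List[float]:
    """Assign one reward per reasoning step (assumed combination rule).

    Steps judged faithful share the outcome reward; steps judged unfaithful
    receive a penalty even when the final answer is correct.
    """
    outcome = 1.0 if outcome_correct else 0.0
    return [outcome if s >= threshold else penalty for s in step_scores]


# Example: the third step is judged unfaithful although the answer is correct.
scores = [0.9, 0.8, 0.2, 0.7]
prefix_len = faithful_prefix_length(scores)            # 2
rewards = step_rewards(scores, outcome_correct=True)   # [1.0, 1.0, -1.0, 1.0]
```

Under a purely outcome-based reward, all four steps in the example would receive 1.0; the step-level variant is what keeps the unfaithful third step from being reinforced.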

Taxonomy

Topics: Topic Modeling · Advanced Graph Neural Networks · Multimodal Machine Learning Applications