The Hallucination Tax of Reinforcement Finetuning

Linxin Song; Taiwei Shi; Jieyu Zhao

arXiv:2505.13988·cs.CL·May 21, 2025

TL;DR

Reinforcement finetuning improves reasoning but causes models to hallucinate more on unanswerable questions, which can be mitigated by incorporating a small amount of synthetic unanswerable data.

Contribution

This work identifies the hallucination tax as a side effect of RFT and proposes a simple data augmentation method to restore refusal behavior in LLMs.

Findings

01. RFT reduces refusal rates by over 80%, increasing hallucinations.

02. Adding 10% SUM data restores refusal behavior with minimal accuracy loss.

03. Improved uncertainty reasoning enhances out-of-domain and factual question answering.

Abstract

Reinforcement finetuning (RFT) has become a standard approach for enhancing the reasoning capabilities of large language models (LLMs). However, its impact on model trustworthiness remains underexplored. In this work, we identify and systematically study a critical side effect of RFT, which we term the hallucination tax: a degradation in refusal behavior that causes models to confidently produce hallucinated answers to unanswerable questions. To investigate this, we introduce SUM (Synthetic Unanswerable Math), a high-quality dataset of unanswerable math problems designed to probe models' ability to recognize an unanswerable question by reasoning about insufficient or ambiguous information. Our results show that standard RFT training can reduce model refusal rates by more than 80%, which significantly increases models' tendency to hallucinate. We further demonstrate that incorporating…
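The two quantities this summary relies on, the share of unanswerable examples in the training mix and the refusal rate, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the `answerable` field, and the refusal string "I don't know" are hypothetical, and the sketch assumes that "10% SUM data" means 10% of the training examples are swapped for unanswerable ones, which this page does not confirm.

```python
import random

# Assumed refusal marker for illustration only; this page does not state
# how a refusal is detected.
REFUSAL = "I don't know"


def mix_unanswerable(answerable, unanswerable, ratio=0.10, seed=0):
    """Swap a `ratio` fraction of answerable examples for unanswerable ones.

    The dataset size stays the same, so `ratio` is also the share of
    unanswerable examples in the result (capped by how many are available).
    """
    rng = random.Random(seed)
    n_swap = min(round(len(answerable) * ratio), len(unanswerable))
    kept = rng.sample(answerable, len(answerable) - n_swap)
    added = rng.sample(unanswerable, n_swap)
    mixed = kept + added
    rng.shuffle(mixed)
    return mixed


def refusal_rate(responses):
    """Fraction of responses containing the refusal marker (case-insensitive)."""
    if not responses:
        return 0.0
    hits = sum(REFUSAL.lower() in r.lower() for r in responses)
    return hits / len(responses)


if __name__ == "__main__":
    # Toy data: 100 answerable and 20 unanswerable placeholder problems.
    answerable = [{"q": f"a{i}", "answerable": True} for i in range(100)]
    unanswerable = [{"q": f"u{i}", "answerable": False} for i in range(20)]
    mixed = mix_unanswerable(answerable, unanswerable)
    print(len(mixed), sum(not ex["answerable"] for ex in mixed))
    print(refusal_rate(["I don't know", "42", "i don't know."]))
```

Substring matching is a deliberately crude refusal detector; a real evaluation would more plausibly parse the model's final answer before comparing it to the refusal marker.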
