Mitigating LLM Hallucination via Behaviorally Calibrated Reinforcement Learning

Jiayun Wu; Jiashuo Liu; Zhiyuan Zeng; Tianyang Zhan; Tianle Cai; Wenhao Huang

arXiv:2512.19920 · cs.LG · January 29, 2026

Open Access

TL;DR

This paper introduces behaviorally calibrated reinforcement learning to reduce hallucinations in large language models by encouraging uncertainty estimation and abstention, improving factual reliability without sacrificing accuracy.

Contribution

It proposes and evaluates training methods that optimize proper scoring rules, enabling models to better calibrate their confidence and abstain when uncertain.
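To make "proper scoring rule" concrete, here is a minimal sketch using the Brier score, one well-known proper scoring rule. This summary does not say which rule the paper optimizes, so the Brier score and the helper names `expected_brier` and `best_report` are illustrative assumptions. The sketch shows that a model's expected penalty is smallest when it reports its true probability of being correct.

```python
def expected_brier(q, p):
    """Expected Brier penalty for reporting confidence q
    when the answer is actually correct with probability p."""
    # With probability p the answer is correct (target 1), otherwise wrong (target 0).
    return p * (1.0 - q) ** 2 + (1.0 - p) * q ** 2


def best_report(p, grid=101):
    """Grid-search the reported confidence that minimizes the expected penalty.
    (Hypothetical helper, for illustration only.)"""
    candidates = [i / (grid - 1) for i in range(grid)]
    return min(candidates, key=lambda q: expected_brier(q, p))


if __name__ == "__main__":
    for p in (0.25, 0.7, 0.9):
        print(f"true p = {p:.2f}, best reported confidence = {best_report(p):.2f}")
```

Because the minimizer coincides with the true probability, a model trained against such a rule has no incentive to overstate or understate its confidence.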

Findings

01 Smaller models outperform larger ones in uncertainty calibration.

02 The model's accuracy-to-hallucination ratio improves significantly.

03 Zero-shot calibration error matches that of frontier models in factual QA.

Abstract

LLM deployment in critical domains is currently impeded by persistent hallucinations: plausible but factually incorrect assertions. While scaling laws have driven significant improvements in general capabilities, theoretical frameworks suggest that hallucination is not merely stochastic error but a predictable statistical consequence of training objectives that prioritize mimicking the data distribution over epistemic honesty. Standard reinforcement learning with verifiable rewards (RLVR) paradigms, which use binary reward signals, inadvertently incentivize models to act as good test-takers rather than honest communicators, encouraging guessing whenever the probability of being correct exceeds zero. This paper presents an exhaustive investigation into behavioral calibration, which incentivizes models to stochastically admit uncertainty by abstaining when not confident, aligning model behavior with accuracy. Synthesizing recent advances, we propose and…
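The guessing incentive described in the abstract can be checked with a little arithmetic. The sketch below is not the paper's training objective. It assumes a toy reward scheme (1 for a correct answer, a configurable value for a wrong answer, 0 for abstaining) and uses hypothetical function names. It compares a binary reward with a reward that penalizes wrong answers enough that answering pays off only above a confidence threshold t.

```python
def answer_value(p, reward_wrong=0.0, reward_correct=1.0):
    """Expected reward for answering when the answer is correct with probability p."""
    return p * reward_correct + (1.0 - p) * reward_wrong


def should_answer(p, reward_wrong=0.0, reward_abstain=0.0):
    """Answer only if the expected reward strictly beats abstaining."""
    return answer_value(p, reward_wrong) > reward_abstain


def penalty_for_threshold(t):
    """Wrong-answer reward that makes answering worthwhile only when p > t.
    Derived from p * 1 + (1 - p) * r > 0 with r = -t / (1 - t), assuming 0 <= t < 1."""
    return -t / (1.0 - t)


if __name__ == "__main__":
    # Binary reward: any nonzero chance of being right makes guessing worthwhile.
    print(should_answer(0.01))                      # guess at 1% confidence
    # Penalized reward with threshold t = 0.5 (wrong answers cost 1 point).
    r = penalty_for_threshold(0.5)
    print(should_answer(0.4, reward_wrong=r))       # below threshold: abstain
    print(should_answer(0.6, reward_wrong=r))       # above threshold: answer
```

Under the binary scheme, abstaining never beats guessing, which is the "good test-taker" behavior the abstract criticizes. Once wrong answers carry a cost, abstention becomes the better choice at low confidence.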

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Advanced Graph Neural Networks