Enhancing Reliability across Short and Long-Form QA via Reinforcement Learning

Yudong Wang; Zhe Yang; Wenhan Ma; Zhifang Sui; Liang Zhao

arXiv:2512.08944·cs.CL·December 11, 2025

TL;DR

This paper introduces a reinforcement learning framework that reduces hallucinations in large language models for short and long-form QA, improving reliability without sacrificing reasoning ability.

Contribution

It presents a novel RL approach that mitigates both intrinsic and extrinsic hallucinations and rewards the model for declining to answer unanswerable questions.

Findings

01. Significant reduction in hallucinations across benchmarks

02. Improved factual accuracy and reliability

03. Enhanced model cautiousness in unanswerable cases

Abstract

While reinforcement learning has unlocked unprecedented complex reasoning in large language models, it has also amplified their propensity for hallucination, creating a critical trade-off between capability and reliability. This work confronts this challenge by introducing a targeted RL framework designed to mitigate both intrinsic and extrinsic hallucinations across short and long-form question answering. We address extrinsic hallucinations (flawed internal knowledge) by creating a novel training set from open-ended conversions of TriviaQA. Concurrently, we tackle intrinsic hallucinations (unfaithfulness to context) by leveraging long-form texts from FineWeb in a fact-grounding reward scheme. To further bolster reliability, our framework explicitly rewards the model for refusing to answer unanswerable questions, thereby cultivating crucial cautiousness. Extensive experiments…
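
The abstract names two reward signals without giving their form: a reward for refusing unanswerable questions and a fact-grounding reward for long-form answers. The sketch below is only an illustration of how such signals could be shaped, not the authors' implementation. Every name in it (`QAExample`, `REFUSAL_MARKERS`, `short_form_reward`, `grounding_reward`), the substring-based answer matching, and the reward values (+1, 0, -1) are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical refusal phrases; a real system would likely use a classifier or judge.
REFUSAL_MARKERS = ("i don't know", "i do not know", "cannot answer", "unanswerable")


@dataclass
class QAExample:
    question: str
    gold_answer: Optional[str]  # None marks the question as unanswerable (assumption)


def is_refusal(response: str) -> bool:
    """Return True if the response contains one of the refusal phrases."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def short_form_reward(example: QAExample, response: str) -> float:
    """Assumed reward shape for short-form QA.

    Unanswerable question: +1 for refusing, -1 for answering anyway.
    Answerable question:   +1 if the gold answer appears in the response,
                            0 for a refusal, -1 for a wrong answer.
    """
    refused = is_refusal(response)
    if example.gold_answer is None:
        return 1.0 if refused else -1.0
    if refused:
        return 0.0
    return 1.0 if example.gold_answer.lower() in response.lower() else -1.0


def grounding_reward(claims_supported: Sequence[bool]) -> float:
    """Assumed long-form reward: fraction of extracted claims supported by the context.

    Claim extraction and verification are outside this sketch; the caller
    passes one boolean per claim.
    """
    if not claims_supported:
        return 0.0
    return sum(claims_supported) / len(claims_supported)
```

Under these assumed values, a refusal on an answerable question scores between a correct and an incorrect answer, so the policy is not pushed to refuse everything but is still penalized less for abstaining than for hallucinating.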

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Advanced Graph Neural Networks