KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality

Baochang Ren; Shuofei Qiao; Da Zheng; Huajun Chen; Ningyu Zhang

arXiv:2506.19807·cs.AI·April 17, 2026

TL;DR

KnowRL introduces a factuality reward mechanism into reinforcement learning to reduce hallucinations in large language models during reasoning, improving factual accuracy without sacrificing reasoning ability.

Contribution

The paper presents a knowledge-enhanced RL approach that guides models to perform fact-based slow thinking, addressing hallucination in LLMs.

Findings

01. KnowRL significantly reduces hallucinations across multiple datasets.

02. The method maintains the original reasoning capabilities of models.

03. Factuality rewards improve the recognition of knowledge boundaries during reasoning.

Abstract

Large Language Models (LLMs), particularly slow-thinking models, often exhibit severe hallucination, outputting incorrect content due to an inability to accurately recognize knowledge boundaries during reasoning. While Reinforcement Learning (RL) can enhance complex reasoning abilities, its outcome-oriented reward mechanism often lacks factual supervision over the thinking process, further exacerbating the hallucination problem. To address the high hallucination in slow-thinking models, we propose Knowledge-enhanced RL, KnowRL. KnowRL guides models to perform fact-based slow thinking by integrating a factuality reward, based on knowledge verification, into the RL training process, helping them recognize their knowledge boundaries. …
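
To make the idea of adding a factuality term to an outcome-oriented reward concrete, here is a minimal sketch. It is not the authors' implementation: the toy knowledge base, the sentence-level claim splitting, the exact-match verification, the additive combination, and the `fact_weight` parameter are all assumptions chosen for illustration. The abstract only states that a factuality reward based on knowledge verification is integrated into RL training.

```python
# Hypothetical stand-in for an external knowledge source (assumption).
KNOWLEDGE_BASE = frozenset({
    "paris is the capital of france",
    "the seine flows through paris",
})


def split_claims(thinking: str) -> list[str]:
    """Naive claim extraction: treat each sentence as one claim (assumption)."""
    return [s.strip().lower() for s in thinking.split(".") if s.strip()]


def factuality_reward(thinking: str, kb: frozenset = KNOWLEDGE_BASE) -> float:
    """Fraction of claims in the thinking trace that the knowledge base supports."""
    claims = split_claims(thinking)
    if not claims:
        return 0.0
    supported = sum(1 for claim in claims if claim in kb)
    return supported / len(claims)


def outcome_reward(answer: str, reference: str) -> float:
    """Outcome-only signal: 1.0 if the final answer matches the reference."""
    return 1.0 if answer.strip().lower() == reference.strip().lower() else 0.0


def combined_reward(thinking: str, answer: str, reference: str,
                    fact_weight: float = 0.5) -> float:
    """Outcome reward plus a weighted factuality term over the thinking trace."""
    return outcome_reward(answer, reference) + fact_weight * factuality_reward(thinking)


# Both traces reach the correct answer, but only the first is fully supported.
grounded = combined_reward("Paris is the capital of France.", "Paris", "paris")
mixed = combined_reward(
    "Lyon is the capital of France. Paris is the capital of France.",
    "Paris", "Paris",
)
```

Under these assumptions, two responses with the same correct final answer receive different rewards when one of them reaches it through an unsupported claim, which is the kind of supervision over the thinking process that an outcome-only reward cannot provide.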
