Exploring the Generalizability of Factual Hallucination Mitigation via Enhancing Precise Knowledge Utilization

Siyuan Zhang; Yichi Zhang; Yinpeng Dong; Hang Su

arXiv:2502.19127·cs.CL·October 14, 2025

TL;DR

This paper introduces PKUE, a method to improve large language models' ability to accurately utilize knowledge, reducing factual hallucinations and enhancing performance across diverse factual and general tasks.

Contribution

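The paper proposes PKUE (Precise Knowledge Utilization Enhancement), a fine-tuning approach that strengthens an LLM's ability to use its own knowledge precisely by applying preference optimization to the model's self-generated responses to simple, precise factual questions.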

Findings

01. PKUE significantly improves factual accuracy in LLMs.

02. Enhanced performance across multiple languages and task types.

03. FactualBench dataset enables comprehensive evaluation.

Abstract

Large Language Models (LLMs) often struggle to align their responses with objective facts, resulting in the issue of factual hallucinations, which can be difficult to detect and mislead users without relevant knowledge. Although post-training techniques have been employed to mitigate the issue, existing methods usually suffer from poor generalization and trade-offs in other different capabilities. In this paper, we propose to address these by directly augmenting LLM's fundamental ability to precisely leverage its knowledge and introduce PKUE (Precise Knowledge Utilization Enhancement), which fine-tunes the model on self-generated responses to precise and simple factual questions through preference optimization. Furthermore, we construct FactualBench, a comprehensive and precise factual QA dataset containing 181k Chinese data spanning 21 domains, to facilitate both evaluation and…
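The training recipe described in the abstract (sample the model's own answers to a factual question, then apply preference optimization) can be illustrated with a minimal sketch. The abstract does not say which preference objective is used or how responses are judged, so the code below makes two loud assumptions: it uses the standard DPO loss as a stand-in objective, and it uses a hypothetical substring-match judge (`is_correct`) to separate correct from incorrect responses. All function names are illustrative and are not taken from the paper.

```python
import math
from itertools import product


def is_correct(response: str, reference: str) -> bool:
    """Hypothetical judge (assumption): case-insensitive substring match."""
    return reference.strip().lower() in response.strip().lower()


def build_preference_pairs(question: str, responses: list[str], reference: str) -> list[dict]:
    """Pair every correct self-generated response with every incorrect one."""
    correct = [r for r in responses if is_correct(r, reference)]
    wrong = [r for r in responses if not is_correct(r, reference)]
    return [
        {"prompt": question, "chosen": c, "rejected": w}
        for c, w in product(correct, wrong)
    ]


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for one pair (assumed objective, not confirmed by the source).

    Computes -log(sigmoid(margin)), where margin is the beta-scaled difference
    between the policy's and the reference model's log-probability gaps.
    """
    margin = beta * (
        (policy_chosen_logp - ref_chosen_logp)
        - (policy_rejected_logp - ref_rejected_logp)
    )
    # Numerically stable softplus(-margin) == -log(sigmoid(margin)).
    x = -margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    pairs = build_preference_pairs(
        "What is the capital of France?",
        ["Paris", "It is Paris.", "Lyon"],
        "Paris",
    )
    print(len(pairs))
    print(dpo_loss(-1.0, -5.0, -2.0, -2.0))
```

In this sketch a question only yields training pairs when the sampled responses include at least one correct and one incorrect answer; questions the model always answers correctly (or always incorrectly) yield no pairs.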

Taxonomy

Topics: Data-Driven Disease Surveillance · Psychology of Moral and Emotional Judgment

Methods: ALIGN