Hallucination to Truth: A Review of Fact-Checking and Factuality Evaluation in Large Language Models

Subhey Sadi Rahman; Md. Adnanul Islam; Md. Mahbub Alam; Musarrat Zeba; Md. Abdur Rahman; Sadia Sultana Chowa; Mohaimenul Azam Khan Raiaan; Sami Azam

arXiv:2508.03860 · cs.CL · January 6, 2026

TL;DR

This review examines how large language models generate misinformation and evaluates current fact-checking methods, emphasizing the need for improved frameworks, external evidence validation, and domain-specific tuning to enhance factual accuracy.

Contribution

The review provides a comprehensive analysis of recent fact-checking techniques for LLMs, highlighting their limitations and proposing research directions for more reliable evaluation methods.

Findings

01. Current metrics have significant limitations in assessing factual accuracy.

02. External evidence validation improves fact-checking reliability.

03. Domain-specific tuning enhances LLM factual consistency.

Abstract

Large Language Models (LLMs) are trained on vast and diverse internet corpora that often include inaccurate or misleading content. Consequently, LLMs can generate misinformation, making robust fact-checking essential. This review systematically analyzes how LLM-generated content is evaluated for factual accuracy by exploring key challenges such as hallucinations, dataset limitations, and the reliability of evaluation metrics. The review emphasizes the need for strong fact-checking frameworks that integrate advanced prompting strategies, domain-specific fine-tuning, and retrieval-augmented generation (RAG) methods. It proposes five research questions that guide the analysis of the recent literature from 2020 to 2025, focusing on evaluation methods and mitigation techniques. Instruction tuning, multi-agent reasoning, and RAG frameworks for external knowledge access are also reviewed. The…
