Lie to Me: Knowledge Graphs for Robust Hallucination Self-Detection in LLMs
Sahil Kale, Antonio Luca Alfeo

TL;DR
This paper introduces a knowledge graph-based method for hallucination self-detection in large language models, reporting improvements of up to 16% in detection accuracy and up to 20% in F1-score.
Contribution
It presents a novel, simple approach that converts LLM responses into knowledge graphs to better detect hallucinations, outperforming existing methods.
Findings
Up to 16% improvement in detection accuracy
Up to 20% improvement in F1-score
Effective across multiple LLMs and datasets
Abstract
Hallucinations, the generation of apparently convincing yet false statements, remain a major barrier to the safe deployment of LLMs. Building on the strong performance of self-detection methods, we examine the use of structured knowledge representations, namely knowledge graphs, to improve hallucination self-detection. Specifically, we propose a simple yet powerful approach that enriches hallucination self-detection by (i) converting LLM responses into knowledge graphs of entities and relations, and (ii) using these graphs to estimate the likelihood that a response contains hallucinations. We evaluate the proposed approach using two widely used LLMs, GPT-4o and Gemini-2.5-Flash, across two hallucination detection datasets. To support more reliable future benchmarking, one of these datasets has been manually curated and enhanced and is released as a secondary outcome of this work.…
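The two steps named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the graphs are turned into a likelihood, so everything below is an assumption made for illustration rather than the authors' method: triples are taken as already extracted (for example by prompting an LLM), and the score is a simple consistency check against graphs built from resampled answers to the same question. The names `Triple`, `to_graph`, and `hallucination_score` are hypothetical.

```python
from typing import FrozenSet, Iterable, Sequence, Tuple

# A knowledge-graph edge: (subject entity, relation, object entity).
Triple = Tuple[str, str, str]


def normalize(triple: Triple) -> Triple:
    """Lowercase and trim each part so trivially different spellings match."""
    subject, relation, obj = (part.strip().lower() for part in triple)
    return (subject, relation, obj)


def to_graph(triples: Iterable[Triple]) -> FrozenSet[Triple]:
    """Step (i), simplified: represent a response as a set of triples.

    Assumption: the triples were already extracted from the response text.
    """
    return frozenset(normalize(t) for t in triples)


def hallucination_score(
    response_graph: FrozenSet[Triple],
    sample_graphs: Sequence[FrozenSet[Triple]],
) -> float:
    """Step (ii), simplified: average fraction of samples that do NOT
    contain each triple of the response. 0.0 means every triple is
    supported by every sample; 1.0 means none is supported by any sample.

    Assumption: this consistency heuristic stands in for whatever
    estimator the paper actually uses.
    """
    if not sample_graphs:
        raise ValueError("at least one sample graph is required")
    if not response_graph:
        return 0.0  # nothing was asserted, so nothing can be unsupported
    unsupported = 0.0
    for triple in response_graph:
        support = sum(triple in g for g in sample_graphs) / len(sample_graphs)
        unsupported += 1.0 - support
    return unsupported / len(response_graph)


# Toy data: the response asserts two facts, and the resampled answers
# agree with only one of them.
response = to_graph([
    ("Marie Curie", "born in", "Warsaw"),
    ("Marie Curie", "won", "Nobel Prize in Chemistry 1905"),
])
samples = [
    to_graph([
        ("Marie Curie", "born in", "Warsaw"),
        ("Marie Curie", "won", "Nobel Prize in Chemistry 1911"),
    ]),
    to_graph([("marie curie", "born in", "warsaw")]),
]
score = hallucination_score(response, samples)
```

In this toy case one of the two asserted triples is supported by every sample and the other by none, so the score is 0.5. Exact matching of normalized triples is a deliberate simplification; a real system would likely need softer matching of entities and relations.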
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Graph Neural Networks · Topic Modeling
