Traceable LLM-based validation of statements in knowledge graphs

Daniel Adam; Tomáš Kliegr

arXiv:2409.07507 · cs.AI · June 12, 2025

TL;DR

This paper introduces a traceable LLM-based method for validating RDF triples in knowledge graphs against externally retrieved documents. On the BioRED dataset it reaches 88% precision but only 44% recall, so human oversight is still required; an evaluation on Wikidata suggests the method can be applied to large-scale verification.

Contribution

The paper proposes a retrieval-augmented generation (RAG) workflow that verifies knowledge graph statements against external documents rather than the LLM's internal knowledge, which makes each verdict traceable to a source, and it assesses the workflow's applicability to biosciences content.
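The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `verbalize`, `chunk`, `validate`, `toy_retrieve`, and `toy_judge` are hypothetical names, the retriever is a fixed in-memory list standing in for a web or Wikipedia search, and the judge is a naive token check standing in for an LLM call that is shown only the claim and one retrieved chunk.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class Verdict:
    supported: bool
    evidence: Optional[str]  # the chunk that justified the verdict
    source: Optional[str]    # identifier of the document the chunk came from


def verbalize(triple: Triple) -> str:
    """Turn a triple into a plain-text claim (assumed, simplistic template)."""
    subject, predicate, obj = triple
    return f"{subject} {predicate} {obj}"


def chunk(text: str, size: int = 40) -> List[str]:
    """Split a document into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def validate(
    triple: Triple,
    retrieve: Callable[[str], Iterable[Tuple[str, str]]],
    judge: Callable[[str, str], bool],
) -> Verdict:
    """Accept the claim only if some retrieved chunk supports it.

    The judge sees nothing but the claim and the chunk, so a positive
    verdict can always be traced back to a concrete source passage.
    """
    claim = verbalize(triple)
    for source, text in retrieve(claim):
        for piece in chunk(text):
            if judge(claim, piece):
                return Verdict(True, piece, source)
    return Verdict(False, None, None)


# Hypothetical stand-ins so the sketch runs without any external service.
def toy_retrieve(claim: str) -> List[Tuple[str, str]]:
    return [("doc-1", "Aspirin is widely used and it treats headache and fever.")]


def toy_judge(claim: str, piece: str) -> bool:
    # Placeholder for an LLM call: every claim token must occur in the chunk.
    chunk_tokens = set(piece.lower().split())
    return all(token in chunk_tokens for token in claim.lower().split())


if __name__ == "__main__":
    print(validate(("aspirin", "treats", "headache"), toy_retrieve, toy_judge))
    print(validate(("aspirin", "treats", "diabetes"), toy_retrieve, toy_judge))
```

Returning the supporting chunk and its source alongside the verdict is what makes the result traceable in this sketch; a claim with no supporting chunk is simply left unconfirmed rather than judged from the model's own memory.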

Findings

01 Precision of 88% on the BioRED dataset

02 Recall of 44%, indicating the need for human oversight

03 Effective on Wikidata for large-scale statement verification

Abstract

This article presents a method for verifying RDF triples using LLMs, with an emphasis on providing traceable arguments. Because LLMs currently cannot reliably identify the origin of the information used to construct a response to a user prompt, our approach avoids using the LLM's internal factual knowledge altogether. Instead, the RDF statements being verified are compared to chunks of external documents retrieved through a web search or Wikipedia. To assess the possible application of this retrieval-augmented generation (RAG) workflow to biosciences content, we evaluated 1,719 positive statements from the BioRED dataset and the same number of newly generated negative statements. The resulting precision is 88% and the recall is 44%, which indicates that the method requires human oversight. We also evaluated the method on the SNLI dataset, which allowed us to compare our approach with models…
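To make the reported precision and recall concrete, here is a small worked calculation. The confusion counts below are hypothetical, scaled to 100 positive and 100 negative statements and chosen only so that they reproduce the 88% and 44% figures; they are not the paper's actual confusion matrix.

```python
def precision(tp: int, fp: int) -> float:
    """Share of statements accepted as valid that really are valid."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of truly valid statements that the method accepted."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts per 100 positive and 100 negative statements.
tp = 44  # valid statements confirmed by a retrieved chunk
fn = 56  # valid statements for which no supporting chunk was found
fp = 6   # invalid statements wrongly confirmed

if __name__ == "__main__":
    print(f"precision = {precision(tp, fp):.2f}")
    print(f"recall    = {recall(tp, fn):.2f}")
```

Under these illustrative counts, most confirmations are correct, but more than half of the valid statements go unconfirmed, which is the gap a human reviewer would have to cover.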

Code & Models

Repositories

danieladam2001/llm-based-validation (Official)

Taxonomy

Topics: Data Quality and Management · Semantic Web and Ontologies