LLM-based SPARQL Query Generation from Natural Language over Federated Knowledge Graphs

Vincent Emonet; Jerven Bolleman; Severine Duvaud; Tarcisio Mendes de Farias; Ana Claudia Sima

arXiv:2410.06062 · cs.DB · February 11, 2025 · 6 cites

TL;DR

This paper presents a Retrieval-Augmented Generation system that uses Large Language Models and KG metadata to accurately translate natural language questions into federated SPARQL queries, with validation to improve correctness.

Contribution

It introduces a novel RAG-based approach combining LLMs and KG metadata for precise federated query generation with validation, enhancing accuracy over existing methods.

Findings

01 Improved accuracy in federated SPARQL query generation

02 Reduced hallucinations through a validation step

03 System available online at chat.expasy.org

Abstract

We introduce a Retrieval-Augmented Generation (RAG) system for translating user questions into accurate federated SPARQL queries over bioinformatics knowledge graphs (KGs) leveraging Large Language Models (LLMs). To enhance accuracy and reduce hallucinations in query generation, our system utilises metadata from the KGs, including query examples and schema information, and incorporates a validation step to correct generated queries. The system is available online at chat.expasy.org.
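
The abstract names three ingredients: retrieval of query examples, schema information placed in the prompt, and a validation step on the generated query. The following is a minimal sketch of how such a pipeline could be wired together, not the authors' implementation. Everything in it is an assumption made for illustration: the `ex:` prefix and its terms are hypothetical, word overlap stands in for whatever similarity measure the real system uses, and the `generated` string stands in for an LLM response.

```python
import re
from dataclasses import dataclass


@dataclass
class Example:
    """A question/query pair, as might be published alongside a KG."""
    question: str
    query: str


def tokenize(text):
    """Lowercase word set; a crude stand-in for an embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, examples, k=1):
    """Return the k examples whose questions overlap most (Jaccard score)."""
    q = tokenize(question)

    def score(ex):
        t = tokenize(ex.question)
        union = q | t
        return len(q & t) / len(union) if union else 0.0

    return sorted(examples, key=score, reverse=True)[:k]


def build_prompt(question, examples, schema_terms):
    """Assemble a prompt from schema terms and the retrieved examples."""
    shots = "\n\n".join(
        f"Question: {ex.question}\nQuery:\n{ex.query}" for ex in examples
    )
    return (
        "Translate the question into a SPARQL query.\n"
        f"Allowed terms: {', '.join(sorted(schema_terms))}\n\n"
        f"{shots}\n\nQuestion: {question}\nQuery:\n"
    )


def unknown_terms(query, schema_terms):
    """List prefixed names in the query that the schema does not define.

    Naive on purpose: it strips <...> IRIs with a regex and does not
    parse SPARQL, so comparisons inside FILTER could confuse it.
    """
    without_iris = re.sub(r"<[^>]*>", "", query)
    used = set(re.findall(r"\b[A-Za-z][\w-]*:[A-Za-z_]\w*", without_iris))
    return sorted(used - set(schema_terms))


# Hypothetical schema and examples (not taken from any real KG).
SCHEMA = {"ex:Protein", "ex:organism"}
EXAMPLES = [
    Example(
        "What are the proteins of human?",
        "PREFIX ex: <http://example.org/>\n"
        "SELECT ?p WHERE { ?p a ex:Protein ; ex:organism ?o . }",
    ),
    Example(
        "List all genes expressed in liver",
        "PREFIX ex: <http://example.org/>\n"
        "SELECT ?g WHERE { ?g ex:expressedIn ?tissue . }",
    ),
]

question = "Which proteins are found in human?"
prompt = build_prompt(question, retrieve(question, EXAMPLES), SCHEMA)

# Stand-in for an LLM response containing a misspelled predicate.
generated = (
    "PREFIX ex: <http://example.org/>\n"
    "SELECT ?p WHERE { ?p a ex:Protein ; ex:organsim ?o . }"
)
problems = unknown_terms(generated, SCHEMA)  # ['ex:organsim']
```

In a loop of this shape, a non-empty `problems` list could be appended to the prompt and sent back to the model as a correction request, which is one plausible reading of "a validation step to correct generated queries".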

Code & Models

Repositories

sib-swiss/sparql-llm (official)

Taxonomy

Topics: Semantic Web and Ontologies