FIRESPARQL: A LLM-based Framework for SPARQL Query Generation over Scholarly Knowledge Graphs
Xueli Pan, Victor de Boer, Jacco van Ossenbruggen

TL;DR
This paper introduces FIRESPARQL, a modular framework that combines fine-tuned LLMs with retrieval-augmented generation to reduce structural and semantic errors in SPARQL queries generated over Scholarly Knowledge Graphs (SKGs).
Contribution
The paper presents FIRESPARQL, a novel framework that combines fine-tuned LLMs, retrieval-augmented generation, and query correction to improve the translation of natural language questions (NLQs) into SPARQL queries over Scholarly Knowledge Graphs (SKGs).
Findings
Fine-tuning the LLM yields the highest reported scores (0.90 ROUGE-L, 0.85 RelaxedEM).
Retrieval-augmented generation improves query quality.
The framework outperforms the baseline approaches.
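For readers unfamiliar with the ROUGE-L figure quoted above, the following is a minimal sketch of how an LCS-based ROUGE-L F1 score could be computed between a generated and a reference SPARQL query. Whitespace tokenization, the balanced F1 weighting, and the `ex:` property names are assumptions made for illustration; the summary does not state the paper's exact evaluation settings.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 over whitespace tokens (tokenization is an assumption)."""
    c, r = candidate.split(), reference.split()
    if not c or not r:
        return 0.0
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    precision = lcs / len(c)
    recall = lcs / len(r)
    return 2 * precision * recall / (precision + recall)


# Hypothetical queries: the generated one is missing a triple pattern.
reference = "SELECT ?paper WHERE { ?paper ex:hasMetric ?metric . ?paper ex:hasDataset ?dataset }"
generated = "SELECT ?paper WHERE { ?paper ex:hasMetric ?metric }"
score = rouge_l_f1(generated, reference)
```

In this example every token of the generated query appears in order in the reference, so precision is 1 while recall is 8/12, which penalizes the missing triple.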
Abstract
Question answering over Scholarly Knowledge Graphs (SKGs) remains a challenging task due to the complexity of scholarly content and the intricate structure of these graphs. Large Language Model (LLM) approaches could be used to translate natural language questions (NLQs) into SPARQL queries; however, these LLM-based approaches struggle with SPARQL query generation due to limited exposure to SKG-specific content and the underlying schema. We identified two main types of errors in the LLM-generated SPARQL queries: (i) structural inconsistencies, such as missing or redundant triples in the queries, and (ii) semantic inaccuracies, where incorrect entities or properties are shown in the queries despite a correct query structure. To address these issues, we propose FIRESPARQL, a modular framework that supports fine-tuned LLMs as a core component, with optional context provided via…
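The two error types named in the abstract can be illustrated with a rough sketch that compares the triple patterns of a generated query against a reference query. This is not the paper's method: the naive parser, the `classify` helper, and the `ex:` properties are all hypothetical, and the comparison is order-sensitive and ignores most of SPARQL syntax.

```python
import re


def triple_patterns(query):
    """Naively extract 3-term triple patterns from the outermost { ... } block."""
    body = re.search(r"\{(.*)\}", query, flags=re.S).group(1)
    triples = []
    for part in body.split(" . "):
        terms = part.strip().rstrip(".").split()
        if len(terms) == 3:
            triples.append(tuple(terms))
    return triples


def shape(triples):
    """Reduce each triple to its variable/constant layout."""
    return [tuple("VAR" if t.startswith("?") else "TERM" for t in tr) for tr in triples]


def mask_vars(triples):
    """Keep constants, hide variable names so renaming is not flagged."""
    return [tuple("?" if t.startswith("?") else t for t in tr) for tr in triples]


def classify(generated, reference):
    g, r = triple_patterns(generated), triple_patterns(reference)
    if shape(g) != shape(r):
        return "structural"  # missing or redundant triples
    if mask_vars(g) != mask_vars(r):
        return "semantic"    # same structure, wrong entity or property
    return "match"


# Hypothetical reference query and two faulty generations.
reference = "SELECT ?p WHERE { ?p ex:hasMetric ?m . ?p ex:hasDataset ?d }"
missing_triple = "SELECT ?p WHERE { ?p ex:hasMetric ?m }"
wrong_property = "SELECT ?p WHERE { ?p ex:hasMetric ?m . ?p ex:usesDataset ?d }"
```

Here `missing_triple` would be labeled structural and `wrong_property` semantic, mirroring categories (i) and (ii) in the abstract.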
