SPARQLing Database Queries from Intermediate Question Decompositions
Irina Saparina, Anton Osokin

TL;DR
This paper introduces a method to translate natural language questions into SPARQL queries using intermediate representations, reducing the need for extensive query annotations and achieving competitive accuracy on the Spider dataset.
Contribution
It proposes a pipeline combining a neural parser and a transpiler to convert questions into SPARQL, leveraging simpler intermediate representations to reduce annotation effort.
Findings
Achieves execution accuracy on Spider comparable to state-of-the-art text-to-SQL methods.
Uses crowdsourced intermediate representations from the Break dataset.
Code and data are publicly available.
Abstract
To translate natural language questions into executable database queries, most approaches rely on a fully annotated training set. Annotating a large dataset with queries is difficult because it requires query-language expertise. We reduce this burden by using intermediate question representations that are grounded in databases. These representations are simpler to collect and were originally crowdsourced within the Break dataset (Wolfson et al., 2020). Our pipeline consists of two parts: a neural semantic parser that converts natural language questions into the intermediate representations, and a non-trainable transpiler to the SPARQL query language (a standard language for accessing knowledge graphs and the semantic web). We chose SPARQL because its queries are structurally closer to our intermediate representations than SQL queries are. We observe that the execution accuracy of queries constructed by our…
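To make the second stage of the pipeline more concrete, here is a minimal sketch of how a non-trainable transpiler might turn a grounded question decomposition into a SPARQL string. It is an illustration under stated assumptions, not the authors' transpiler: the `Step` class, the four toy operators, the `:singer` and `:age` identifiers, and the convention that a table maps to an RDF class and a column to a predicate are all hypothetical. References such as `#1` point to the result of an earlier step, as in Break decompositions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Step:
    """One step of a toy grounded decomposition (hypothetical format)."""
    op: str          # "select", "project", "filter" or "aggregate"
    args: List[str]  # grounded arguments; "#k" refers to the result of step k


def ref(arg: str, var_of: Dict[int, str]) -> str:
    """Resolve a "#k" reference to the SPARQL variable of step k."""
    return var_of[int(arg.lstrip("#"))]


def transpile(steps: List[Step]) -> str:
    """Build a SPARQL query string from the steps using fixed rules."""
    patterns: List[str] = []     # triple patterns and FILTERs for WHERE
    var_of: Dict[int, str] = {}  # step index (1-based) -> variable/expression
    for i, step in enumerate(steps, start=1):
        var = f"?v{i}"
        if step.op == "select":
            # args: [class]; assumes a table is exposed as an RDF class
            patterns.append(f"{var} a {step.args[0]} .")
            var_of[i] = var
        elif step.op == "project":
            # args: [predicate, "#k"]; assumes a column is a predicate
            patterns.append(f"{ref(step.args[1], var_of)} {step.args[0]} {var} .")
            var_of[i] = var
        elif step.op == "filter":
            # args: ["#kept", "#compared", comparator, literal]
            kept, compared, comparator, literal = step.args
            patterns.append(f"FILTER({ref(compared, var_of)} {comparator} {literal})")
            var_of[i] = ref(kept, var_of)  # a filter returns the kept items
        elif step.op == "aggregate":
            # args: [function, "#k"]
            func, src = step.args
            var_of[i] = f"({func.upper()}({ref(src, var_of)}) AS {var})"
        else:
            raise ValueError(f"unsupported operator: {step.op}")
    body = " ".join(patterns)
    # The last step defines what the query returns.
    return f"SELECT {var_of[len(steps)]} WHERE {{ {body} }}"


# Toy decomposition of "How many singers are older than 30?"
steps = [
    Step("select", [":singer"]),
    Step("project", [":age", "#1"]),
    Step("filter", ["#1", "#2", ">", "30"]),
    Step("aggregate", ["count", "#3"]),
]
query = transpile(steps)
print(query)
```

Each step contributes either a triple pattern or a filter, and later steps reuse earlier variables through their references. This step-by-step correspondence is one way to read the abstract's remark that SPARQL queries are structurally close to the intermediate representations. The resulting string still needs a `PREFIX` declaration for `:` before it can be run against an endpoint.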
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks
