Spider4SSC & S2CLite: A text-to-multi-query-language dataset using lightweight ontology-agnostic SPARQL to Cypher parser
Martin Vejvar, Yasutaka Fujimoto

TL;DR
This paper introduces S2CLite, a lightweight, rule-based parser that translates SPARQL to Cypher without external tools, significantly improving parsing accuracy and enabling a new unified text-to-query dataset.
Contribution
The paper presents S2CLite, a novel ontology-agnostic, rule-based parser for SPARQL to Cypher translation, and introduces the Spider4SSC dataset for multi-query language tasks.
Findings
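S2CLite achieves 77.8% parsing accuracy on Spider4SPARQL, compared to 44.2% for S2CTrans.
On the intersecting subset of queries parsed by both parsers, S2CLite achieves 96.6% execution accuracy, outperforming S2CTrans by 7.3%.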
The Spider4SSC dataset contains 4525 questions with multi-query language equivalents.
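The two accuracy figures measure different things: parsing accuracy is taken over all queries, while execution accuracy (per the abstract) is taken only over the queries that both parsers managed to parse. A minimal sketch of that bookkeeping follows. The function names and the toy query ids are hypothetical, and how an execution is judged correct is not specified here, so the sketch simply takes a set of ids already judged correct as input.

```python
def parsing_accuracy(parsed_ids: set, total_queries: int) -> float:
    """Fraction of all queries that the parser translated without error."""
    return len(parsed_ids) / total_queries


def execution_accuracy_on_intersection(correct_ids: set,
                                       parsed_a: set,
                                       parsed_b: set) -> float:
    """Fraction of correct executions for parser A, restricted to the
    queries that BOTH parsers parsed (the intersecting subset)."""
    shared = parsed_a & parsed_b
    if not shared:
        raise ValueError("the two parsers share no parsed queries")
    return len(correct_ids & shared) / len(shared)


# Toy ids invented purely for illustration; these are not the paper's data.
parsed_a = {1, 2, 3, 4, 5, 6, 7, 8}   # queries parser A could parse
parsed_b = {1, 2, 3, 4, 5}            # queries parser B could parse
correct_a = {1, 2, 3, 4, 6}           # parser A outputs judged correct

acc_parse = parsing_accuracy(parsed_a, total_queries=10)
acc_exec = execution_accuracy_on_intersection(correct_a, parsed_a, parsed_b)
```

Restricting to the shared subset keeps the comparison fair: neither parser is penalised or rewarded in execution accuracy for queries the other could not parse at all.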
Abstract
We present the Spider4SSC dataset and the S2CLite parsing tool. S2CLite is a lightweight, ontology-agnostic parser that translates SPARQL queries into Cypher queries, enabling both in-situ and large-scale SPARQL to Cypher translation. Unlike existing solutions, S2CLite is purely rule-based (inspired by traditional programming language compilers) and operates without requiring an RDF graph or external tools. Experiments conducted on the BSBM42 and Spider4SPARQL datasets show that S2CLite significantly reduces query parsing errors, achieving a total parsing accuracy of 77.8% on Spider4SPARQL compared to 44.2% for the state-of-the-art S2CTrans. Furthermore, S2CLite achieved a 96.6% execution accuracy on the intersecting subset of queries parsed by both parsers, outperforming S2CTrans by 7.3%. We further use S2CLite to parse Spider4SPARQL queries to Cypher and generate Spider4SSC, a unified…
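To make the idea of a purely rule-based SPARQL to Cypher translation concrete, here is a minimal sketch for a tiny SPARQL subset (one SELECT with a basic graph pattern). It is not S2CLite's implementation. The function name `sparql_to_cypher` and the three mapping rules are assumptions chosen for illustration: a type triple becomes a node label, a triple with a literal object becomes a property filter, and a triple with a variable object becomes a relationship.

```python
import re


def sparql_to_cypher(query: str) -> str:
    """Translate a tiny SPARQL subset to Cypher using fixed rules.

    Assumptions of this sketch: subjects are variables, every token is
    whitespace-separated, triples are separated by a standalone '.',
    and literals contain no spaces.
    """
    m = re.search(r"SELECT\s+(.*?)\s+WHERE\s*\{(.*)\}", query, re.S | re.I)
    if m is None:
        raise ValueError("unsupported query shape")
    select_vars = re.findall(r"\?(\w+)", m.group(1))

    # Group whitespace-separated tokens into triples at each '.' token.
    triples, current = [], []
    for tok in m.group(2).split():
        if tok == ".":
            if current:
                triples.append(current)
                current = []
        else:
            current.append(tok)
    if current:
        triples.append(current)

    def local(name: str) -> str:
        # ":Person" -> "Person", "ex:knows" -> "knows"
        return name.split(":")[-1]

    labels, edges, filters, subjects = {}, [], [], []
    for triple in triples:
        if len(triple) != 3:
            raise ValueError(f"unsupported triple: {triple}")
        s, p, o = triple
        sv = s.lstrip("?")
        if sv not in subjects:
            subjects.append(sv)
        if p in ("a", "rdf:type"):        # rule 1: type triple -> node label
            labels[sv] = local(o)
        elif o.startswith("?"):           # rule 2: variable object -> relationship
            edges.append((sv, local(p), o[1:]))
        else:                             # rule 3: literal object -> property filter
            filters.append(f"{sv}.{local(p)} = {o}")

    def node(v: str) -> str:
        return f"({v}:{labels[v]})" if v in labels else f"({v})"

    patterns = [f"{node(a)}-[:{rel}]->{node(b)}" for a, rel, b in edges]
    connected = {v for a, _, b in edges for v in (a, b)}
    patterns += [node(v) for v in subjects if v not in connected]

    cypher = "MATCH " + ", ".join(patterns)
    if filters:
        cypher += " WHERE " + " AND ".join(filters)
    return cypher + " RETURN " + ", ".join(select_vars)


example = 'SELECT ?p ?f WHERE { ?p a :Person . ?p :name "Alice" . ?p :knows ?f . ?f a :Person }'
cypher = sparql_to_cypher(example)
# cypher == 'MATCH (p:Person)-[:knows]->(f:Person) WHERE p.name = "Alice" RETURN p, f'
```

The sketch needs neither an RDF graph nor an ontology, but that has a cost: it treats every variable object as a node, so it cannot tell a relationship from a property lookup such as `?p :name ?n`. A full parser must resolve that ambiguity in some other way.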
Taxonomy
Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Data Quality and Management
