An Empirical Study of Real-World SPARQL Queries
Mario Arias, Javier D. Fernández, Miguel A. Martínez-Prieto, Pablo de la Fuente

TL;DR
This study analyzes 3 million real-world SPARQL queries to identify common patterns, focusing on structural elements and join types, to inform query engine optimization and RDF store tuning.
Contribution
It provides the first large-scale empirical analysis of real-world SPARQL queries, highlighting prevalent query structures and join patterns.
Findings
Most queries are simple with few triple patterns and joins.
Star-shaped graph patterns are most common.
Short triple pattern chains are typical.
Abstract
Understanding how users tailor their SPARQL queries is crucial when designing query evaluation engines or fine-tuning RDF stores with performance in mind. In this paper we analyze 3 million real-world SPARQL queries extracted from logs of the DBpedia and SWDF public endpoints. We aim to find the most used language elements from both syntactic and structural perspectives, paying special attention to triple patterns and joins, since they are among the most expensive SPARQL operations during evaluation. We have determined that most queries are simple and include few triple patterns and joins, with Subject-Subject, Subject-Object and Object-Object being the most common join types. Graph patterns are usually star-shaped and, although triple pattern chains exist, they are generally short.
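To make the join terminology in the abstract concrete, here is a minimal Python sketch that labels the joins in a basic graph pattern by the positions of the shared variable. It is not the authors' analysis code. The tuple representation of triple patterns, the `join_types` helper, and the sample patterns (with illustrative prefixed names) are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations

# Positions of a triple pattern: Subject, Predicate, Object.
POSITIONS = ("S", "P", "O")


def is_variable(term):
    """SPARQL variables are written with a leading '?' in this sketch."""
    return term.startswith("?")


def join_types(patterns):
    """Count joins between every pair of triple patterns.

    A join exists when two patterns share a variable. The label records
    the positions of that variable in both patterns, e.g. "S-S" or "S-O".
    Labels are unordered, so Object-Subject is also counted as "S-O".
    """
    counts = Counter()
    for left, right in combinations(patterns, 2):
        for i, left_term in enumerate(left):
            if not is_variable(left_term):
                continue
            for j, right_term in enumerate(right):
                if left_term == right_term:
                    pair = sorted((POSITIONS[i], POSITIONS[j]), key=POSITIONS.index)
                    counts["-".join(pair)] += 1
    return counts


# Hypothetical star-shaped pattern: every triple shares the subject ?film.
star = [
    ("?film", "rdf:type", "dbo:Film"),
    ("?film", "dbo:director", "?dir"),
    ("?film", "rdfs:label", "?name"),
]

# Hypothetical chain: the object of the first triple is the subject of the next.
chain = [
    ("?film", "dbo:director", "?dir"),
    ("?dir", "dbo:birthPlace", "?city"),
]

print(dict(join_types(star)))
print(dict(join_types(chain)))
```

In this sketch a star-shaped pattern yields only Subject-Subject joins, while each link of a chain contributes one Subject-Object join, which is why the shape of a graph pattern and its join types are closely related.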
Taxonomy
Topics: Semantic Web and Ontologies · Advanced Database Systems and Queries · Data Quality and Management
