gSmart: An Efficient SPARQL Query Engine Using Sparse Matrix Algebra -- Full Version
Yuedan Chen, M. Tamer Özsu, Guoqing Xiao, Zhuo Tang, Kenli Li

TL;DR
gSmart is a novel SPARQL query engine leveraging sparse matrix algebra to efficiently process large RDF datasets, achieving significant speedups on high-performance heterogeneous architectures.
Contribution
The paper introduces a matrix algebra-based approach to SPARQL query execution, with optimized data structures and parallel processing techniques.
Findings
Up to 46920x speedup over existing engines on a CPU+GPU HPC system.
Scales from 2 to 16 nodes with a 6.9x speedup.
Efficient query evaluation using grouped incident edge-based matrix operations.
Abstract
Efficient execution of SPARQL queries over large RDF datasets is a topic of considerable interest due to increased use of RDF to encode data. Most of this work has followed either relational or graph-based approaches. In this paper, we propose an alternative query engine, called gSmart, based on matrix algebra. This approach can potentially better exploit the computing power of high-performance heterogeneous architectures that we target. gSmart incorporates: (1) grouped incident edge-based SPARQL query evaluation, in which all unevaluated edges of a vertex are evaluated together using a series of matrix operations to fully utilize query constraints and narrow down the solution space; (2) a graph query planner that determines the order in which vertices in query graphs should be evaluated; (3) memory- and computation-efficient data structures including the light-weight sparse matrix…
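To make point (1) of the abstract more concrete, here is a minimal Python sketch of evaluating all edges incident to one query vertex together using sparse boolean matrix operations. It is an illustration under stated assumptions, not gSmart's actual data structures or algorithms: the dict-of-sets matrix layout, the helper names (`build_matrices`, `spmv`, `row_support`), and the toy triples are all hypothetical.

```python
from collections import defaultdict

def build_matrices(triples):
    """One sparse boolean matrix per predicate: {predicate: {subject: {objects}}}.
    This layout is an assumption made for illustration only."""
    mats = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        mats[p][s].add(o)
    return mats

def spmv(matrix, vector):
    """Boolean sparse matrix-vector product: rows with a nonzero in a column of `vector`."""
    return {row for row, cols in matrix.items() if cols & vector}

def row_support(matrix):
    """Rows that contain at least one nonzero entry."""
    return {row for row, cols in matrix.items() if cols}

# Toy RDF data (hypothetical).
triples = [
    ("alice", "knows", "bob"),
    ("alice", "worksAt", "acme"),
    ("bob", "knows", "carol"),
    ("bob", "worksAt", "initech"),
    ("carol", "worksAt", "acme"),
]
mats = build_matrices(triples)

# Query: SELECT ?x ?y WHERE { ?x knows ?y . ?x worksAt acme }
# Both edges incident to ?x are evaluated together: the candidate set for ?x
# is the element-wise AND of the two per-edge results, which shrinks the
# solution space before ?y is expanded.
candidates_x = row_support(mats["knows"]) & spmv(mats["worksAt"], {"acme"})

# Expand ?y only for the surviving ?x candidates.
results = sorted((x, y) for x in candidates_x for y in mats["knows"][x])
print(results)
```

Intersecting the per-edge candidate sets before expanding neighbours is the point of the grouping: constraints from every incident edge prune a vertex's candidates at once, rather than one edge at a time.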
Taxonomy
Topics: Graph Theory and Algorithms · Semantic Web and Ontologies · Advanced Database Systems and Queries
