Vertex-centric Parallel Computation of SQL Queries
Ainur Smagulova, Alin Deutsch

TL;DR
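This paper introduces TAG-join, a vertex-centric parallel algorithm for executing SQL queries. It matches the theoretical communication and computation complexity of state-of-the-art join algorithms, is competitive with (and often outperforms) reference RDBMSs on a single multi-core server, and outperforms Spark SQL on a distributed cluster.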
Contribution
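It presents a scheme for executing SQL queries in parallel on top of any vertex-centric BSP graph processing engine. The scheme consists of a graph encoding of relational instances and a vertex program, TAG-join, that runs over that encoding.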
Findings
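TAG-join matches the theoretical communication and computation complexity of state-of-the-art join algorithms.
On a single multi-core server, running on the TigerGraph engine, TAG-join is competitive with (and often outperforms) reference RDBMSs on the TPC benchmarks those systems are traditionally tuned for.
In a distributed cluster, TAG-join outperforms Spark SQL.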
Abstract
We present a scheme for parallel execution of SQL queries on top of any vertex-centric BSP graph processing engine. The scheme comprises a graph encoding of relational instances and a vertex program specification of our algorithm called TAG-join, which matches the theoretical communication and computation complexity of state-of-the-art join algorithms. When run on top of the vertex-centric TigerGraph database engine on a single multi-core server, TAG-join exploits thread parallelism and is competitive with (and often outperforms) reference RDBMSs on the TPC benchmarks they are traditionally tuned for. In a distributed cluster, TAG-join outperforms the popular Spark SQL engine.
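To make the vertex-centric idea concrete, here is a minimal sketch of an equi-join expressed as message passing between vertices in two BSP supersteps. It is not the paper's TAG-join algorithm, and it does not use TigerGraph. The graph encoding (one vertex per tuple, one vertex per distinct join-attribute value), the function name vertex_centric_join, and the sample relations are all assumptions made for illustration, since the abstract does not spell out the encoding.

```python
from collections import defaultdict


def vertex_centric_join(r_rows, s_rows, key):
    """Toy two-superstep equi-join of two lists of dicts on attribute `key`.

    Hypothetical illustration only: the encoding below is an assumption,
    not the encoding defined in the paper.
    """
    # Assumed graph encoding: one vertex per tuple, one vertex per distinct
    # value of the join attribute, and an edge from each tuple to its value.
    tuple_vertices = [("R", row) for row in r_rows] + [("S", row) for row in s_rows]

    # Superstep 1: every tuple vertex sends itself along its edge to the
    # value vertex for row[key]. The inbox stands in for the message buffer
    # that a BSP engine delivers at the synchronization barrier.
    inbox = defaultdict(lambda: {"R": [], "S": []})
    for relation, row in tuple_vertices:
        inbox[row[key]][relation].append(row)

    # Superstep 2: each value vertex works only on its own inbox, pairing
    # the R-tuples and S-tuples that reached it. In a real engine these
    # per-vertex computations would run in parallel.
    results = []
    for received in inbox.values():
        for r in received["R"]:
            for s in received["S"]:
                results.append({**r, **s})
    return results


# Sample data (made up for this sketch).
orders = [
    {"cust_id": 1, "order_id": 10},
    {"cust_id": 2, "order_id": 11},
    {"cust_id": 1, "order_id": 12},
]
customers = [
    {"cust_id": 1, "name": "Ada"},
    {"cust_id": 3, "name": "Bob"},
]
joined = vertex_centric_join(orders, customers, "cust_id")
```

In this sketch, tuples with no partner on the other side produce no output because their value vertex receives messages from only one relation. Each value vertex touches only the tuples sent to it, which is the property that lets a vertex-centric engine distribute the work across threads or machines.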
