K-Join: Combining Vertex Covers for Parallel Joins
Simon Frisk, Austen Fan, Paraschos Koutris

TL;DR
This paper introduces K-Join, a parallel join algorithm that combines multiple vertex covers to reduce data transfer between machines when processing joins in massively parallel systems.
Contribution
The paper presents a new algorithm that chooses its HyperCube shares as a linear combination of vertex covers, together with a new hypergraph measure, the reduced quasi vertex-cover, that characterizes the resulting load of parallel join queries.
Findings
Achieves load n/p^{1/κ} for input size n and p processors, where κ is the new hypergraph measure
Matches or improves on the load of state-of-the-art algorithms
Introduces the reduced quasi vertex-cover as a key theoretical measure
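To make the load expression concrete, the short sketch below evaluates n/p^{1/κ} for a few values of κ, with n the input size and p the number of processors. The function name `load_bound` and the sample values of n, p, and κ are illustrative assumptions, not figures from the paper, and constant and logarithmic factors are ignored.

```python
def load_bound(n: int, p: int, kappa: float) -> float:
    """Evaluate n / p**(1/kappa), ignoring constant and log factors.

    n     -- input size (number of tuples)
    p     -- number of processors
    kappa -- value of the hypergraph measure (assumed given, not computed here)
    """
    if n < 0 or p < 1 or kappa <= 0:
        raise ValueError("need n >= 0, p >= 1, kappa > 0")
    return n / p ** (1.0 / kappa)


# Hypothetical sample values, chosen only to show the trend.
N, P = 1_000_000, 64
for kappa in (1.0, 1.5, 2.0):
    print(f"kappa={kappa}: load ~ {load_bound(N, P, kappa):.0f}")
```

With n = 1,000,000 and p = 64, the value grows from about 15,625 at κ = 1 to about 125,000 at κ = 2, so a smaller κ corresponds to a lower per-machine load.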
Abstract
Significant research effort has been devoted to improving the performance of join processing in the massively parallel computation model, where the goal is to evaluate a query with the minimum possible data transfer between machines. However, it is still an open question to determine the best possible parallel algorithm for any join query. In this paper, we present an algorithm that takes a step forward in this endeavour. Our new algorithm is simple and builds on two existing ideas: data partitioning and the HyperCube primitive. The novelty in our approach comes from a careful choice of the HyperCube shares, which is done as a linear combination of multiple vertex covers. The resulting load with input size n and p processors is characterized as n/p^{1/κ}, where κ is a new hypergraph theoretic measure we call the reduced quasi vertex-cover. The new measure matches or…
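As a rough illustration of the HyperCube primitive mentioned in the abstract, the sketch below routes tuples of the triangle query R(a,b), S(b,c), T(c,a) onto a grid of servers with one dimension per variable. Everything here is an assumption made for illustration: the function names, the toy data, the use of `value % share` as the hash, and the hand-picked shares 2 × 2 × 2. It does not implement the paper's share selection, which combines multiple vertex covers.

```python
from itertools import product

# Triangle query: each relation lists the variables of its columns, in order.
VARIABLES = ("a", "b", "c")
QUERY = {"R": ("a", "b"), "S": ("b", "c"), "T": ("c", "a")}


def destinations(rel_vars, tup, shares):
    """Grid coordinates of every server that must receive `tup`.

    A coordinate is fixed by hashing when the relation contains that
    variable, and ranges over the whole share (replication) otherwise.
    The hash here is simply `value % share`, assuming integer values.
    """
    bound = dict(zip(rel_vars, tup))
    axes = [
        [bound[v] % shares[v]] if v in bound else range(shares[v])
        for v in VARIABLES
    ]
    return list(product(*axes))


def shuffle(data, shares):
    """Send every input tuple to its destination servers."""
    servers = {}
    for rel, tuples in data.items():
        for tup in tuples:
            for coord in destinations(QUERY[rel], tup, shares):
                frag = servers.setdefault(coord, {r: set() for r in QUERY})
                frag[rel].add(tup)
    return servers


def local_triangles(frag):
    """Join the fragments held by a single server."""
    return {
        (a, b, c)
        for (a, b) in frag["R"]
        for (b2, c) in frag["S"]
        if b2 == b and (c, a) in frag["T"]
    }


# Toy instance (hypothetical data) on p = 2 * 2 * 2 = 8 servers.
shares = {"a": 2, "b": 2, "c": 2}
data = {
    "R": {(1, 2), (2, 3)},
    "S": {(2, 3), (3, 1)},
    "T": {(3, 1), (1, 2)},
}
servers = shuffle(data, shares)
result = set().union(*(local_triangles(f) for f in servers.values()))
max_load = max(sum(len(s) for s in f.values()) for f in servers.values())
```

Each tuple is replicated along the dimension of the variable it does not contain, so every potential output tuple meets all of its input tuples at exactly one server. The shares determine how much each relation is replicated, which is why their choice drives the load.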
Taxonomy
Topics: Graph Theory and Algorithms · Algorithms and Data Compression · Advanced Database Systems and Queries
