Parallel Query Processing with Heterogeneous Machines
Simon Frisk, Paraschos Koutris

TL;DR
This paper investigates the problem of parallel conjunctive query processing on heterogeneous machines, establishing bounds and algorithms for minimizing maximum machine cost in a single communication round.
Contribution
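It introduces bounds and algorithms for parallel query processing when each machine has its own cost function, extending previous models from linear cost functions to more general ones and from databases with equal-sized relations to databases whose relations have different cardinalities.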
Findings
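Established a lower bound and a matching upper bound for databases in which every relation has the same cardinality.
Derived a lower bound for databases whose relations have different cardinalities.
Gave matching upper bounds in that setting for specific queries: the Cartesian product, the join, the star query, and the triangle query.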
Abstract
We study the problem of computing a full Conjunctive Query in parallel using heterogeneous machines. Our computational model is similar to the MPC model, but each machine has its own cost function mapping from the number of bits it receives to a cost. An optimal algorithm should minimize the maximum cost across all machines. We consider algorithms over a single communication round and give a lower bound and matching upper bound for databases where each relation has the same cardinality. We do this both for linear cost functions, as in previous work, and for more general cost functions. For databases with relations of different cardinalities, we also find a lower bound, and give matching upper bounds for specific queries like the Cartesian product, the join, the star query, and the triangle query. Our approach is inspired by the HyperCube algorithm, but there are additional…
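To make the cost model in the abstract concrete, here is a minimal sketch that assumes linear cost functions of the form cost_i(n) = c_i * n and a fixed amount of data that can be split arbitrarily, with no replication. This is a simplification for illustration only, not the paper's algorithm: in actual query processing, tuples are typically replicated across machines. The function names `balance_linear` and `max_machine_cost` are hypothetical. Under these assumptions, the maximum cost is minimized by giving each machine a load proportional to 1 / c_i, so that all machines finish with the same cost.

```python
from fractions import Fraction


def max_machine_cost(loads, cost_per_bit):
    """Objective from the abstract: the maximum cost over all machines.

    Assumes linear cost functions cost_i(n) = cost_per_bit[i] * n.
    """
    return max(c * n for c, n in zip(cost_per_bit, loads))


def balance_linear(total_bits, cost_per_bit):
    """Split total_bits so that every machine incurs the same cost.

    Illustrative assumption: loads may be fractional and no data is
    replicated. Machine i receives a share proportional to 1 / c_i.
    Returns (loads, resulting maximum cost).
    """
    inverse = [Fraction(1) / Fraction(c) for c in cost_per_bit]
    total_inverse = sum(inverse)
    loads = [total_bits * w / total_inverse for w in inverse]
    return loads, Fraction(total_bits) / total_inverse


if __name__ == "__main__":
    costs = [1, 2, 2]  # one fast machine, two machines twice as expensive per bit
    loads, cost = balance_linear(1000, costs)
    print([str(x) for x in loads], str(cost))
    # Compare against an even split that ignores heterogeneity.
    even = [Fraction(1000, 3)] * 3
    print(str(max_machine_cost(even, costs)))
```

With per-bit costs 1, 2, 2 and 1000 bits, the balanced split assigns 500, 250, and 250 bits for a maximum cost of 500, whereas an even split yields a maximum cost of 2000/3 because the slower machines become the bottleneck.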
