Communication Cost in Parallel Query Processing
Paul Beame, Paraschos Koutris, Dan Suciu

TL;DR
This paper analyzes the communication cost of computing conjunctive queries in parallel. It gives essentially tight bounds for one-round algorithms and nearly matching bounds for multi-round algorithms on tree-like queries, as functions of data skew and query structure, with implications for problems such as graph connectivity.
Contribution
The paper establishes essentially tight bounds on the communication cost of parallel query processing, accounting for data skew and query structure, and develops both algorithms and lower bounds in the Massively Parallel Communication (MPC) model.
Findings
Essentially tight upper and lower bounds for one-round algorithms on skew-free data.
Improved algorithms for skewed data when approximate degrees of the heavy-hitter elements are available.
Nearly matching bounds for tree-like queries in multi-round settings.
Abstract
We study the problem of computing conjunctive queries over large databases on parallel architectures without shared storage. Using the structure of such a query and the skew in the data, we study tradeoffs between the number of processors, the number of rounds of communication, and the per-processor load -- the number of bits each processor can send or can receive in a single round -- that are required to compute the query. When the data is free of skew, we obtain essentially tight upper and lower bounds for one round algorithms and we show how the bounds degrade when there is skew in the data. In the case of skewed data, we show how to improve the algorithms when approximate degrees of the heavy-hitter elements are available, obtaining essentially optimal algorithms for queries such as simple joins and triangle join queries. For queries that we identify as tree-like, we also prove…
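To make the notion of per-processor load concrete, here is a minimal sketch of how skew inflates the load in a single round. It is not the paper's algorithm: it assumes plain modular hash partitioning on a join attribute, counts load in tuples rather than bits, and uses hypothetical names (max_load, uniform, skewed) and illustrative sizes.

```python
from collections import Counter

def max_load(tuples, key_index, p):
    """Largest number of tuples any of p servers receives when tuples are
    partitioned by (join value mod p). Illustrative stand-in for a hash function."""
    loads = Counter(t[key_index] % p for t in tuples)
    return max(loads.values())

p = 8      # number of servers (assumed for illustration)
m = 8000   # number of tuples in the relation (assumed for illustration)

# Skew-free relation R(x, y): every join value y appears exactly once.
uniform = [(i, i) for i in range(m)]

# Skewed relation: a single heavy hitter y = 0 accounts for half the tuples.
skewed = [(i, 0) for i in range(m // 2)] + [(i, i) for i in range(m // 2, m)]

uniform_load = max_load(uniform, 1, p)  # m / p = 1000 tuples per server
skewed_load = max_load(skewed, 1, p)    # server 0 receives 4500 tuples

print(uniform_load, skewed_load)
```

With skew-free data every server receives m/p tuples. With one heavy hitter, all of its tuples land on a single server regardless of p, which is the kind of degradation the abstract refers to and the reason heavy-hitter degree information can help.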
Taxonomy
Topics: Graph Theory and Algorithms · Data Management and Algorithms · Complexity and Algorithms in Graphs
