Parallelizing Query Optimization on Shared-Nothing Architectures

Immanuel Trummer; Christoph Koch

arXiv:1511.01768·cs.DB·November 6, 2015

Parallelizing Query Optimization on Shared-Nothing Architectures

Immanuel Trummer, Christoph Koch

PDF

Open Access

TL;DR

This paper introduces scalable algorithms for parallel query optimization on shared-nothing architectures, significantly reducing optimization time by dividing the plan space among multiple workers with minimal synchronization.

Contribution

It presents novel parallel algorithms for query optimization that efficiently utilize large clusters without extensive data exchange or synchronization.

Findings

01

Achieved up to 10x speedup on 100-node clusters.

02

Parallelization scales linearly with the number of workers and query size.

03

Effective for large queries with long optimization times.

Abstract

Data processing systems offer an ever increasing degree of parallelism on the levels of cores, CPUs, and processing nodes. Query optimization must exploit high degrees of parallelism in order not to gradually become the bottleneck of query evaluation. We show how to parallelize query optimization at a massive scale. We present algorithms for parallel query optimization in left-deep and bushy plan spaces. At optimization start, we divide the plan space for a given query into partitions of equal size that are explored in parallel by worker nodes. At the end of optimization, each worker returns the optimal plan in its partition to the master which determines the globally optimal plan from the partition-optimal plans. No synchronization or data exchange is required during the actual optimization phase. The amount of data sent over the network, at the start and at the end of optimization,…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsAdvanced Database Systems and Queries · Data Management and Algorithms · Graph Theory and Algorithms