Fast OLAP Query Execution in Main Memory on Large Data in a Cluster
Demian Hespe, Martin Weidner, Jonathan Dees, Peter Sanders

TL;DR
This paper presents techniques for efficient parallel execution of analytical SQL queries on large datasets in a cluster environment, optimizing hardware utilization and communication.
Contribution
It introduces a prototype system with precompiled plans, full parallelization, and efficient communication for large-scale OLAP queries in a cluster setting.
Findings
Successful execution of TPC-H queries on 30 TB of data in a 128-node cluster.
Achieved high CPU utilization and efficient inter-node communication.
Demonstrated scalability and performance improvements over single-machine systems.
Abstract
Main memory column-stores have proven to be efficient for processing analytical queries. Still, there has been much less work in the context of clusters. Using only a single machine poses several restrictions: Processing power and data volume are bounded to the number of cores and main memory fitting on one tightly coupled system. To enable the processing of larger data sets, switching to a cluster becomes necessary. In this work, we explore techniques for efficient execution of analytical SQL queries on large amounts of data in a parallel database cluster while making maximal use of the available hardware. This includes precompiled query plans for efficient CPU utilization, full parallelization on single nodes and across the cluster, and efficient inter-node communication. We implement all features in a prototype for running a subset of TPC-H benchmark queries. We evaluate our…
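As a rough illustration of the "parallel on each node, then communicate compactly" idea mentioned in the abstract, the following is a minimal sketch of a two-phase grouped aggregation. It is not the authors' implementation: the function names (`local_aggregate`, `merge_partials`, `distributed_sum`), the use of threads to stand in for cluster nodes, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def local_aggregate(rows):
    """Phase 1 (hypothetical): one node computes SUM(value) GROUP BY key
    over its own partition, with no communication."""
    partial = Counter()
    for key, value in rows:
        partial[key] += value
    return partial


def merge_partials(partials):
    """Phase 2 (hypothetical): combine the per-node partial sums.
    Only these small partial results would cross the network."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds values per key
    return dict(total)


def distributed_sum(partitions):
    """Run phase 1 on every partition concurrently, then merge.
    Threads stand in for cluster nodes in this sketch."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(local_aggregate, partitions))
    return merge_partials(partials)


# Toy data: three "nodes", each holding (group key, value) rows.
partitions = [
    [("A", 10), ("B", 5)],
    [("A", 1), ("C", 7)],
    [("B", 2), ("C", 3)],
]
result = distributed_sum(partitions)
print(result)
```

The design point of the sketch is that each partition is reduced locally before anything is exchanged, so the data volume that must be communicated depends on the number of groups rather than the number of rows.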
Taxonomy
Topics: Advanced Database Systems and Queries · Data Management and Algorithms · Advanced Data Storage Technologies
