Forecasting the cost of processing multi-join queries via hashing for main-memory databases (Extended version)
Feilong Liu, Spyros Blanas (The Ohio State University)

TL;DR
This paper develops and validates a memory I/O cost model for predicting multi-join query performance in main-memory databases, and shows that traditional plan choices can be suboptimal on modern shared-everything architectures.
Contribution
It introduces a new cost model tailored for in-memory multi-join query evaluation and demonstrates that conventional wisdom on join tree structures does not always hold in shared-everything memory systems.
Findings
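The cost model accurately predicts query performance on three different systems, including an Amazon EC2 instance.
The performance gap between left-deep and right-deep plans can reach 10X as the number of joins in a query grows.
Shared-everything memory architectures change which join tree shapes perform well, so guidance derived for earlier architectures does not carry over directly.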
Abstract
Database management systems (DBMSs) carefully optimize complex multi-join queries to avoid expensive disk I/O. As servers today feature tens or hundreds of gigabytes of RAM, a significant fraction of many analytic databases becomes memory-resident. Even after careful tuning for an in-memory environment, a linear disk I/O model such as the one implemented in PostgreSQL may make query response time predictions that are up to 2X slower than the optimal multi-join query plan over memory-resident data. This paper introduces a memory I/O cost model to identify good evaluation strategies for complex query plans with multiple hash-based equi-joins over memory-resident data. The proposed cost model is carefully validated for accuracy using three different systems, including an Amazon EC2 instance, to control for hardware-specific differences. Prior work in parallel query evaluation has advocated…
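To make the plan-shape discussion concrete, here is a minimal sketch of a three-way hash equi-join evaluated in two ways. It is not the paper's cost model: the relations `R`, `S`, `T`, the helper `hash_join`, and the two counters (`build_inserts`, `probe_lookups`) are hypothetical and only illustrate how the choice of build and probe inputs shifts work between hash table construction and probing. Under the classic convention where the left child of a join is the build side, the first plan corresponds to a left-deep tree and the second to a right-deep tree; this summary does not state which convention the paper uses.

```python
from collections import defaultdict


def hash_join(build, probe, key, stats):
    """Hash equi-join on `key`: build a table on `build`, then probe with `probe`."""
    table = defaultdict(list)
    for row in build:
        table[row[key]].append(row)
        stats["build_inserts"] += 1  # one hash table insert per build tuple
    out = []
    for row in probe:
        stats["probe_lookups"] += 1  # one hash table lookup per probe tuple
        for match in table.get(row[key], []):
            out.append({**match, **row})
    return out


def normalize(rows):
    """Order-independent form of a result, for comparing the two plans."""
    return sorted(tuple(sorted(r.items())) for r in rows)


# Toy data (assumed): R is the larger input, S and T are small.
R = [{"a": i, "b": i % 3} for i in range(6)]
S = [{"b": j, "c": j % 2} for j in range(3)]
T = [{"c": k, "d": 10 * k} for k in range(2)]

# Plan A: each intermediate result becomes the build side of the next join.
stats_a = {"build_inserts": 0, "probe_lookups": 0}
inter_a = hash_join(build=R, probe=S, key="b", stats=stats_a)
result_a = hash_join(build=inter_a, probe=T, key="c", stats=stats_a)

# Plan B: the small base relations are built, and R's tuples probe through
# both tables. A real engine would pipeline the probes; this sketch
# materializes the intermediate list for simplicity.
stats_b = {"build_inserts": 0, "probe_lookups": 0}
inter_b = hash_join(build=S, probe=R, key="b", stats=stats_b)
result_b = hash_join(build=T, probe=inter_b, key="c", stats=stats_b)

print("Plan A:", stats_a)
print("Plan B:", stats_b)
print("Same result:", normalize(result_a) == normalize(result_b))
```

Both plans return the same rows, but they split the work differently between inserts and lookups. A memory I/O cost model would have to weigh those operations against each other, along with effects this sketch ignores, such as hash table size relative to cache capacity and parallel execution.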
