Design Trade-offs for a Robust Dynamic Hybrid Hash Join (Extended Version)
Shiva Jahangiri, Michael J. Carey, Johann-Christoph Freytag

TL;DR
This paper analyzes and optimizes the design trade-offs of a robust, dynamic Hybrid Hash Join algorithm, enhancing performance in database systems through extensive experiments and new partitioning strategies.
Contribution
It introduces new techniques for partition management and dynamic spilling in Hybrid Hash Join, with experimental validation in a real database system.
Findings
Choosing a suitable number of partitions improves join performance.
Partition insertion techniques enhance memory utilization.
Dynamic spilling algorithms outperform previous methods.
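As a rough illustration of the partitioning and dynamic spilling ideas summarized above, here is a minimal sketch of a hybrid hash join that spills build partitions on demand. It is not the algorithm evaluated in the paper. The function name `dynamic_hybrid_hash_join`, the record-count `memory_budget`, and the "spill the largest resident partition" victim policy are all assumptions made for illustration, and plain Python lists stand in for spill files on disk.

```python
from collections import defaultdict


def dynamic_hybrid_hash_join(build, probe, key_b, key_p,
                             num_partitions=4, memory_budget=8):
    """Toy dynamic hybrid hash join (hypothetical sketch).

    memory_budget is the maximum number of build records kept in memory;
    a real system would count frames or bytes instead.
    Returns (joined pairs, sorted ids of spilled partitions).
    """
    def part_of(k):
        return hash(k) % num_partitions

    in_memory = [defaultdict(list) for _ in range(num_partitions)]
    sizes = [0] * num_partitions   # resident build records per partition
    spilled_build = {}             # partition id -> list standing in for a disk file
    resident = 0

    # Build phase: insert records, spilling a victim partition when over budget.
    for rec in build:
        k = key_b(rec)
        p = part_of(k)
        if p in spilled_build:
            spilled_build[p].append(rec)   # partition already lives "on disk"
            continue
        in_memory[p][k].append(rec)
        sizes[p] += 1
        resident += 1
        while resident > memory_budget:
            # Assumed victim policy: spill the largest memory-resident partition.
            victim = max(
                (q for q in range(num_partitions) if q not in spilled_build),
                key=lambda q: sizes[q],
            )
            spilled_build[victim] = [
                r for rows in in_memory[victim].values() for r in rows
            ]
            resident -= sizes[victim]
            in_memory[victim].clear()
            sizes[victim] = 0

    # Probe phase: join against resident partitions, defer the rest.
    results = []
    spilled_probe = defaultdict(list)
    for rec in probe:
        k = key_p(rec)
        p = part_of(k)
        if p in spilled_build:
            spilled_probe[p].append(rec)
        else:
            for b in in_memory[p].get(k, ()):
                results.append((b, rec))

    # Second pass: join each spilled build/probe partition pair.
    # (A real operator would recurse if a pair still exceeded memory.)
    for p, b_rows in spilled_build.items():
        table = defaultdict(list)
        for b in b_rows:
            table[key_b(b)].append(b)
        for rec in spilled_probe.get(p, ()):
            for b in table.get(key_p(rec), ()):
                results.append((b, rec))

    return results, sorted(spilled_build)
```

The sketch starts with every partition in memory and only spills when the budget is exceeded, so it needs no prior size estimate for the build input. That is the sense in which it is "dynamic". The number of partitions and the victim policy are exactly the kind of design knobs the summary above refers to, and the values used here are arbitrary.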
Abstract
The Join operator, as one of the most expensive and commonly used operators in database systems, plays a substantial role in Database Management System (DBMS) performance. Among the many different Join algorithms studied over the last decades, Hybrid Hash Join (HHJ) has proven to be one of the most efficient and widely used join algorithms. While the performance of HHJ depends largely on accurate statistics and information about the input relations, it may not always be practical or possible for a system to have such information available. The design of HHJ depends on many details to perform well. This paper is an experimental and analytical study of the trade-offs in designing a robust and dynamic HHJ operator. We revisit the design and optimization techniques suggested by previous studies through extensive experiments, comparing them with other algorithms designed by us or used in…
Taxonomy
Topics: Advanced Database Systems and Queries · Data Management and Algorithms · Caching and Content Delivery
