Revisiting Reuse in Main Memory Database Systems

Kayhan Dursun; Carsten Binnig; Ugur Cetintemel; Tim Kraska

arXiv:1608.05678 · cs.DB · August 22, 2016 · 1 cites

TL;DR

This paper introduces a novel reuse model for internal data structures in main memory databases, specifically hash tables, to improve analytical query performance without the overhead of materializing intermediates.

Contribution

It proposes a cache-aware reuse model for hash tables in main memory DBMSs, optimizing query plans by considering cache locality and data movement costs.
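To make the reuse idea concrete, here is a minimal sketch in Python, not the paper's implementation. It assumes a hypothetical `HashTableCache` keyed by a hand-written signature (table name and join column) and a hypothetical `hash_join` helper. It shows only that a hash table built for one join can serve a later query without being rebuilt; it does not model cache locality, data movement costs, or any optimizer decision.

```python
from collections import defaultdict


class HashTableCache:
    """Toy cache of join hash tables (hypothetical, for illustration only)."""

    def __init__(self):
        self._tables = {}
        self.builds = 0  # counts how many hash tables were actually built

    def get_or_build(self, signature, rows, key):
        # Build the hash table only if no table with this signature exists yet.
        if signature not in self._tables:
            table = defaultdict(list)
            for row in rows:
                table[row[key]].append(row)
            self._tables[signature] = table
            self.builds += 1
        return self._tables[signature]


def hash_join(cache, build_sig, build_rows, build_key, probe_rows, probe_key):
    """Join probe_rows against build_rows, reusing a cached hash table if present."""
    table = cache.get_or_build(build_sig, build_rows, build_key)
    # dict.get does not insert missing keys, so probing leaves the table unchanged.
    return [(b, p) for p in probe_rows for b in table.get(p[probe_key], [])]


# Made-up example data.
customers = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
orders = [
    {"cust": 1, "amt": 10},
    {"cust": 2, "amt": 5},
    {"cust": 1, "amt": 7},
]

cache = HashTableCache()
sig = ("customers", "id")  # assumed signature scheme: (table, join column)

# First query builds the hash table on customers.id.
q1 = hash_join(cache, sig, customers, "id", orders, "cust")

# Second query has a different probe side but reuses the same hash table.
large_orders = [o for o in orders if o["amt"] > 6]
q2 = hash_join(cache, sig, customers, "id", large_orders, "cust")
```

After both queries, `cache.builds` is 1: the second join probes the table that the first join had to build anyway, so no extra materialization step is needed to make it reusable.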

Findings

01. Achieves 2x performance improvements on analytical workloads

02. No additional overhead for materializing intermediates

03. Employs cost models considering cache hierarchy and hash table statistics

Abstract

Reusing intermediates in databases to speed up analytical query processing has been studied in the past. Existing solutions typically require the intermediate results of individual operators to be materialized into temporary tables before they can be considered for reuse in subsequent queries. However, these approaches are fundamentally ill-suited to modern main memory databases. The reason is that modern main memory DBMSs are typically limited by the bandwidth of the memory bus, so query execution is heavily optimized to keep tuples in the CPU caches and registers. Adding materialization operations to a query plan therefore not only adds traffic to the memory bus but, more importantly, destroys cache- and register-locality opportunities, resulting in high performance penalties. In this paper we study a novel reuse model for intermediates, which caches…

Taxonomy

Topics: Advanced Database Systems and Queries · Data Management and Algorithms · Cloud Computing and Resource Management