Hybrid Materialization in a Disk-Based Column-Store
Evgeniy Klyuchikov, Elena Mikhailova, George Chernishev

TL;DR
This paper introduces a hybrid materialization strategy for disk-based column-stores that combines positions and values, enabling more flexible query plans and nearly doubling performance compared to existing methods.
Contribution
The paper proposes a novel hybrid materialization model that manipulates both positions and values at the same time, enhancing flexibility and performance in distributed disk-based column-stores.
Findings
Hybrid materialization is nearly twice as fast as existing strategies.
Supports a new class of flexible query plans.
Demonstrated effectiveness on TPC-H benchmark scenarios.
Abstract
In column-oriented query processing, a materialization strategy determines when lightweight positions (row IDs) are translated into tuples. It is an important part of column-store architecture, since it defines the class of supported query plans and therefore impacts overall system performance. In this paper we continue investigating materialization strategies for a distributed disk-based column-store. We start by demonstrating cases in which existing approaches impose fundamental limitations on the resulting system performance. Then, in order to address them, we propose a new hybrid materialization model. The main feature of hybrid materialization is the ability to manipulate both positions and values at the same time. This way, the query engine can flexibly combine the advantages of all the existing strategies and support a new class of query plans. Moreover, hybrid materialization…
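To make the distinction in the abstract concrete, here is a minimal Python sketch of how the point of materialization can differ between strategies. It is an illustration under stated assumptions, not the authors' implementation: the toy in-memory `columns` table, the function names, and the single-predicate query are all hypothetical, and a real disk-based column-store would operate on blocks read from disk rather than Python lists.

```python
# Toy in-memory "columns" (assumption: stands in for on-disk column files).
columns = {
    "price":    [10, 25, 7, 40, 25],
    "quantity": [1, 3, 5, 2, 4],
    "region":   ["EU", "US", "EU", "US", "EU"],
}


def early_materialization(cols, threshold):
    """Build full tuples first, then filter the tuples."""
    rows = list(zip(cols["price"], cols["quantity"], cols["region"]))
    return [row for row in rows if row[0] > threshold]


def late_materialization(cols, threshold):
    """Operate on positions (row IDs) only; fetch values at the very end."""
    positions = [i for i, p in enumerate(cols["price"]) if p > threshold]
    return [
        (cols["price"][i], cols["quantity"][i], cols["region"][i])
        for i in positions
    ]


def hybrid_materialization(cols, threshold):
    """Carry positions together with values already read (here: price),
    and use the positions to fetch the remaining columns later."""
    partial = [(i, p) for i, p in enumerate(cols["price"]) if p > threshold]
    return [(p, cols["quantity"][i], cols["region"][i]) for i, p in partial]


if __name__ == "__main__":
    for strategy in (early_materialization,
                     late_materialization,
                     hybrid_materialization):
        print(strategy.__name__, strategy(columns, 20))
```

All three functions return the same rows; they differ only in when values are attached to positions. In this sketch the hybrid variant avoids re-reading the `price` column, which the purely positional variant must fetch a second time.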
Taxonomy
Topics: Data Management and Algorithms · Advanced Database Systems and Queries · Cloud Computing and Resource Management
