An Adaptive Column Compression Family for Self-Driving Databases
Marcell Fehér, Daniel E. Lucani, Ioannis Chatzigeorgiou

TL;DR
This paper introduces an adaptive column compression method for in-memory databases that compresses better than LZ4 while keeping query speeds close to those of the fastest existing segment encoders.
Contribution
It presents a novel adaptive compressor that offers a new trade-off point between memory footprint and query speed for in-memory column stores in self-driving database systems.
Findings
Achieves better compression than LZ4.
Reaches query speeds close to those of the fastest existing segment encoders.
Evaluated on synthetic data in isolation and on the TPC-H and Join Order Benchmarks, integrated into the Hyrise column store.
Abstract
Modern in-memory databases are typically used for high-performance workloads, so they have to be optimized for a small memory footprint and high query speed at the same time. Data compression has the potential to reduce memory requirements but often reduces query speed too. In this paper we propose a novel, adaptive compressor that offers a new trade-off point between these dimensions, achieving better compression than LZ4 while reaching query speeds close to those of the fastest existing segment encoders. We evaluate our compressor both with synthetic data in isolation and on the TPC-H and Join Order Benchmarks, integrated into a modern relational column store, Hyrise.
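To make the idea of per-segment encoding choice concrete, the following minimal Python sketch picks, for each fixed-size segment of an integer column, the cheapest of three generic encodings (raw, run-length, dictionary). It is not the compressor proposed in the paper: the encodings, the cost model, the segment size, and all function names are assumptions chosen only for illustration, and query speed is not modeled at all.

```python
from typing import List, Tuple


def run_length_encode(values: List[int]) -> List[Tuple[int, int]]:
    """Collapse consecutive repeats into (value, run_length) pairs."""
    runs: List[Tuple[int, int]] = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs


def dictionary_encode(values: List[int]) -> Tuple[List[int], List[int]]:
    """Return (sorted dictionary, per-row codes into that dictionary)."""
    dictionary = sorted(set(values))
    index = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [index[v] for v in values]


def estimated_cost(encoding: str, values: List[int]) -> int:
    """Crude, hypothetical size model: counts stored integers, not bytes."""
    if encoding == "raw":
        return len(values)
    if encoding == "rle":
        return 2 * len(run_length_encode(values))
    if encoding == "dict":
        dictionary, codes = dictionary_encode(values)
        # Assumption: each code costs a quarter of a full integer (rounded up).
        return len(dictionary) + (len(codes) + 3) // 4
    raise ValueError(f"unknown encoding: {encoding}")


def choose_encoding(values: List[int]) -> str:
    """Pick the encoding with the smallest estimated cost for one segment."""
    return min(("raw", "rle", "dict"), key=lambda e: estimated_cost(e, values))


def encode_column(column: List[int], segment_size: int = 8) -> List[str]:
    """Split a column into segments and choose an encoding for each one."""
    segments = [
        column[i:i + segment_size] for i in range(0, len(column), segment_size)
    ]
    return [choose_encoding(segment) for segment in segments]


if __name__ == "__main__":
    column = [7] * 8 + [1, 2, 3, 4, 5, 6, 7, 8] + [1, 2] * 4
    print(encode_column(column))
```

Under this toy cost model, a constant segment favors run-length encoding, a segment of all-distinct values is left raw, and a segment with few distinct but alternating values favors dictionary encoding. A real system would also have to weigh decoding cost against size, which is the trade-off the abstract describes.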
Taxonomy
Topics: Algorithms and Data Compression · Advanced Data Storage Technologies · Advanced Database Systems and Queries
