Upscaledb: Efficient Integer-Key Compression in a Key-Value Store using SIMD Instructions
Daniel Lemire, Christoph Rupp

TL;DR
This paper presents SIMD-accelerated integer key compression techniques for key-value stores that reduce memory usage and improve query speed, demonstrated within the Upscaledb database system.
Contribution
It introduces novel SIMD-based algorithms for integer key compression that operate directly on compressed data, enhancing performance and compression ratio in database systems.
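One way to picture "operating directly on compressed data" is Frame-of-Reference, a scheme named in the paper's abstract: every key in a block is stored as a fixed-width offset from the block minimum, so a single key can be read, or a binary search run, without unpacking the block. The sketch below is a scalar illustration only. The names `for_pack`, `for_select` and `for_lower_bound` are hypothetical, the block layout is an assumption, and nothing here reflects Upscaledb's actual SIMD code.

```python
from typing import List, Tuple

# Hypothetical block layout: (base key, bits per offset, key count, packed offsets).
Block = Tuple[int, int, int, int]


def for_pack(sorted_keys: List[int]) -> Block:
    """Pack a non-empty sorted list of keys as fixed-width offsets from the minimum."""
    base = sorted_keys[0]
    # Width needed for the largest offset; at least one bit per key.
    width = max(1, (sorted_keys[-1] - base).bit_length())
    packed = 0
    for i, key in enumerate(sorted_keys):
        packed |= (key - base) << (i * width)
    return base, width, len(sorted_keys), packed


def for_select(block: Block, i: int) -> int:
    """Read the i-th key straight from the packed block, without unpacking the rest."""
    base, width, count, packed = block
    if not 0 <= i < count:
        raise IndexError(i)
    return base + ((packed >> (i * width)) & ((1 << width) - 1))


def for_lower_bound(block: Block, target: int) -> int:
    """Binary search on the packed block: index of the first key >= target."""
    lo, hi = 0, block[2]
    while lo < hi:
        mid = (lo + hi) // 2
        if for_select(block, mid) < target:
            lo = mid + 1
        else:
            hi = mid
    return lo


block = for_pack([1000, 1003, 1010, 1015])
third_key = for_select(block, 2)
position = for_lower_bound(block, 1004)
```

In this example the four keys need only 4 bits each beyond the shared base, and both lookups touch the packed integer directly.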
Findings
SIMD-accelerated binary packing can answer queries up to 40% faster than on uncompressed data.
Compression reduces memory usage by up to a factor of ten.
Techniques are effective for both transactional and analytic workloads.
Abstract
Compression can sometimes improve performance by making more of the data available to the processors faster. We consider the compression of integer keys in a B+-tree index. For this purpose, systems such as IBM DB2 use variable-byte compression over differentially coded keys. We revisit this problem with various compression alternatives such as Google's VarIntGB, Binary Packing and Frame-of-Reference. In all cases, we describe algorithms that can operate directly on compressed data. Many of our alternatives exploit the single-instruction-multiple-data (SIMD) instructions supported by modern CPUs. We evaluate our techniques in a database environment provided by Upscaledb, a production-quality key-value database. Our best techniques are SIMD accelerated: they simultaneously reduce memory usage while improving single-threaded speeds. In particular, a differentially coded SIMD…
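The baseline the abstract mentions, variable-byte compression over differentially coded keys, can be sketched in a few lines. This is a plain scalar illustration under stated assumptions: the function names `encode_keys`, `iter_keys` and `lower_bound` are hypothetical, the byte layout (7 payload bits per byte, high bit as a continuation flag) is one common convention, and none of it is taken from Upscaledb or IBM DB2.

```python
from typing import Iterable, Iterator, Optional


def encode_keys(sorted_keys: Iterable[int]) -> bytes:
    """Differentially code sorted non-negative keys, then variable-byte encode each gap."""
    out = bytearray()
    prev = 0
    for key in sorted_keys:
        delta = key - prev
        if delta < 0:
            raise ValueError("keys must be sorted and non-negative")
        prev = key
        # 7 payload bits per byte; a set high bit means more bytes follow.
        while delta >= 0x80:
            out.append((delta & 0x7F) | 0x80)
            delta >>= 7
        out.append(delta)
    return bytes(out)


def iter_keys(data: bytes) -> Iterator[int]:
    """Yield keys one at a time from the compressed stream (prefix sum over the gaps)."""
    key = 0
    delta = 0
    shift = 0
    for byte in data:
        delta |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            key += delta
            yield key
            delta = 0
            shift = 0


def lower_bound(data: bytes, target: int) -> Optional[int]:
    """Smallest stored key >= target, found by scanning the compressed bytes."""
    for key in iter_keys(data):
        if key >= target:
            return key
    return None


keys = [3, 10, 130, 131, 100000]
compressed = encode_keys(keys)
```

Here the five keys fit in 7 bytes because small gaps take a single byte each, and `lower_bound` stops as soon as it reaches a qualifying key instead of materializing the whole list.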
