Balancing Garbage Collection vs I/O Amplification using hybrid Key-Value Placement in LSM-based Key-Value Stores
Giorgos Xanthakis, Giorgos Saloustros, Nikos Batsaras, Anastasios Papagiannis, Angelos Bilas

TL;DR
Parallax introduces a hybrid key-value placement strategy in LSM-based stores, reducing garbage collection overhead and I/O amplification, leading to significant performance improvements over existing methods.
Contribution
The paper proposes a hybrid KV placement approach that adapts to KV size categories, reducing garbage collection overhead and I/O amplification in LSM-based key-value stores.
Findings
Up to 12.4x throughput increase
Up to 27.1x reduction in I/O amplification
Up to 28x CPU efficiency improvement
Abstract
Key-value (KV) separation is a technique that introduces randomness in the I/O access patterns to reduce I/O amplification in LSM-based key-value stores for fast storage devices (NVMe). KV separation has a significant drawback that makes it less attractive: delete and especially update operations, which are important in modern workloads, result in frequent and expensive garbage collection (GC) in the value log. In this paper, we design and implement Parallax, which proposes hybrid KV placement that reduces GC overhead significantly and maximizes the benefits of using a log. We first model the benefits of KV separation for different KV pair sizes. We use this model to classify KV pairs into three categories: small, medium, and large. Then, Parallax uses different approaches for each KV category: it always places large values in a log and small values in place. For medium values it uses a mixed…
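The size-based classification described in the abstract can be illustrated with a minimal sketch. This is not Parallax's implementation: the byte thresholds, the use of total KV pair size as the classification input, and all names (`classify`, `choose_placement`, `Placement`) are hypothetical, chosen only for illustration. The abstract is truncated before it describes the medium-category policy, so the sketch only labels that case as mixed.

```python
from enum import Enum

# Hypothetical thresholds in bytes. The abstract does not state the real
# boundaries; Parallax derives its categories from a model, not constants.
SMALL_MAX_BYTES = 128
LARGE_MIN_BYTES = 1024


class Placement(Enum):
    IN_PLACE = "in_place"  # KV pair is stored inside the LSM levels
    LOG = "log"            # value is appended to the value log
    MIXED = "mixed"        # medium category; policy not detailed in the abstract


def classify(key: bytes, value: bytes) -> str:
    """Return 'small', 'medium', or 'large' from the total KV pair size.

    Using key size plus value size is an assumption made for this sketch.
    """
    size = len(key) + len(value)
    if size <= SMALL_MAX_BYTES:
        return "small"
    if size >= LARGE_MIN_BYTES:
        return "large"
    return "medium"


def choose_placement(key: bytes, value: bytes) -> Placement:
    """Map a KV pair to a placement: small in place, large in the log."""
    category = classify(key, value)
    if category == "small":
        return Placement.IN_PLACE
    if category == "large":
        return Placement.LOG
    return Placement.MIXED


if __name__ == "__main__":
    for value_len in (10, 500, 2000):
        placement = choose_placement(b"k", b"v" * value_len)
        print(value_len, placement.value)
```

Keeping classification separate from placement means the thresholds could be replaced by a model-driven rule without touching the placement logic.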
Taxonomy
Topics: Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques · Caching and Content Delivery
