Scavenger: Better Space-Time Trade-Offs for Key-Value Separated LSM-trees
Jianshun Zhang, Fang Wang, Sheng Qiu, Yi Wang, Jiaxin Ou, Junxun Huang, Baoquan Li, Peng Fang, Dan Feng

TL;DR
This paper introduces Scavenger, a KV-separated LSM-tree design that improves the trade-off between performance and space amplification. It combines an I/O-efficient garbage collection scheme with a space-aware compaction strategy, reducing I/O overhead and improving write efficiency.
Contribution
Scavenger presents an I/O-efficient garbage collection scheme and a space-aware compaction strategy tailored for KV-separated LSM-trees, addressing space amplification issues.
Findings
Significantly reduces space amplification compared to existing methods.
Improves write performance through optimized garbage collection.
Outperforms BlobDB, Titan, and TerarkDB in experiments.
Abstract
Key-Value Stores (KVS) implemented with log-structured merge-trees (LSM-trees) have gained widespread acceptance in storage systems. Nonetheless, a significant challenge arises in the form of high write amplification due to the compaction process. While KV-separated LSM-trees successfully tackle this issue, they also bring about substantial space amplification problems, a concern that cannot be overlooked in cost-sensitive scenarios. Garbage collection (GC) holds significant promise for space amplification reduction, yet existing GC strategies often fall short in performance because they lack thorough consideration of workload characteristics. Additionally, current KV-separated LSM-trees ignore the adverse effect of space amplification in the index LSM-tree. In this paper, we systematically analyze the sources of space amplification of KV-separated LSM-trees and introduce…
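To make the quantity under discussion concrete, the following is a minimal sketch of how space amplification is commonly computed for a KV-separated store: total on-disk bytes (index LSM-tree plus value store) divided by the logical size of the live user data. The names `StoreSizes`, `space_amplification`, and `garbage_ratio`, and the sample sizes, are hypothetical illustrations and are not taken from the paper or from any particular system.

```python
from dataclasses import dataclass


@dataclass
class StoreSizes:
    """Hypothetical size snapshot of a KV-separated store (any consistent unit)."""
    live_bytes: int    # logical size of the current, valid user data
    index_bytes: int   # on-disk size of the index LSM-tree (keys + value pointers)
    value_bytes: int   # on-disk size of the value store (live values + garbage)


def space_amplification(s: StoreSizes) -> float:
    """Total on-disk size divided by live user data size."""
    if s.live_bytes <= 0:
        raise ValueError("live_bytes must be positive")
    return (s.index_bytes + s.value_bytes) / s.live_bytes


def garbage_ratio(total_bytes: int, live_bytes: int) -> float:
    """Fraction of a file or store occupied by stale (overwritten/deleted) data."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    if not 0 <= live_bytes <= total_bytes:
        raise ValueError("live_bytes must be between 0 and total_bytes")
    return 1.0 - live_bytes / total_bytes


# Illustrative numbers only: 100 units of live data, a 12-unit index tree,
# and a 160-unit value store that still holds stale values awaiting GC.
sizes = StoreSizes(live_bytes=100, index_bytes=12, value_bytes=160)
amplification = space_amplification(sizes)
```

Under this definition both components contribute: stale values left in the value store until garbage collection reclaims them, and obsolete entries in the index LSM-tree until compaction drops them. That matches the abstract's point that the index tree's share should not be ignored.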
