BVLSM: Write-Efficient LSM-Tree Storage via WAL-Time Key-Value Separation
Ming Li, Wendi Cheng, Jiahe Wei, Xueqiang Shan, Weikai Liu, Xiaonan Zhao, Xiao Zhang

TL;DR
BVLSM introduces a proactive key-value separation mechanism that runs during the WAL phase. It improves write efficiency and memory utilization and reduces I/O jitter in LSM-Tree storage under big-value workloads, outperforming existing systems such as RocksDB and BlobDB.
Contribution
It proposes a novel WAL-time key-value separation approach that reduces redundant data in MemTables and repeated writes, enhancing performance in LSM-Tree-based key-value stores.
Findings
7.6x throughput improvement over RocksDB
1.9x throughput improvement over BlobDB
Significant reduction in write amplification and I/O jitter
Abstract
Modern data-intensive applications increasingly store and process big-value items, such as multimedia objects and machine learning embeddings, which exacerbate storage inefficiencies in Log-Structured Merge-Tree (LSM)-based key-value stores. This paper presents BVLSM, a Write-Ahead Log (WAL)-time key-value separation mechanism designed to address three key challenges in LSM-Tree storage systems: write amplification, poor memory utilization, and I/O jitter under big-value workloads. State-of-the-art approaches delay key-value separation until the flush stage, which leads to redundant data in MemTables and repeated writes. BVLSM instead proactively decouples keys and values during the WAL phase. The MemTable stores only lightweight metadata, which allows big values to be stored in parallel across multiple queues. The benchmark results show that BVLSM significantly outperforms both RocksDB and BlobDB under 64KB…
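As a rough illustration of the idea described in the abstract, the following Python sketch models a store that separates big values at write time and keeps only a small pointer in the MemTable. It is a toy in-memory model under stated assumptions, not the authors' implementation: the class name `ToyWalSeparatedStore`, the `BIG_VALUE_THRESHOLD` cutoff, and the `(offset, length)` pointer layout are all hypothetical, and the multi-queue parallel storage mentioned in the abstract is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple, Union

# Hypothetical cutoff in bytes; this value is an assumption, not taken from the paper.
BIG_VALUE_THRESHOLD = 4096


@dataclass
class ToyWalSeparatedStore:
    """Toy in-memory model of separating big values at write (WAL) time."""

    # Stands in for the on-disk log that holds big values.
    value_log: bytearray = field(default_factory=bytearray)
    # Maps key -> inline small value, or an (offset, length) pointer into value_log.
    memtable: Dict[bytes, Union[bytes, Tuple[int, int]]] = field(default_factory=dict)

    def put(self, key: bytes, value: bytes) -> None:
        if len(value) >= BIG_VALUE_THRESHOLD:
            offset = len(self.value_log)
            self.value_log += value  # the big value is written once, to the log
            self.memtable[key] = (offset, len(value))  # MemTable keeps only metadata
        else:
            self.memtable[key] = value  # small values stay inline

    def get(self, key: bytes) -> Optional[bytes]:
        entry = self.memtable.get(key)
        if entry is None:
            return None
        if isinstance(entry, tuple):
            offset, length = entry
            return bytes(self.value_log[offset:offset + length])
        return entry


store = ToyWalSeparatedStore()
store.put(b"k1", b"x" * 65536)  # big value: goes to the value log
store.put(b"k2", b"small")      # small value: stored inline in the MemTable
```

In this model, a 64 KiB value occupies only a two-integer pointer in the MemTable, which is the memory-utilization effect the abstract attributes to storing lightweight metadata instead of full values.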
Taxonomy
Topics: Advanced Data Storage Technologies · Caching and Content Delivery · Network Packet Processing and Optimization
