A Benchmark for Databases with Varying Value Lengths
Danushka Liyanage, Shubham Pandey, Joshua Goldstein, Michael Cahill, Akon Dey, Alan Fekete, Uwe Röhm

TL;DR
This paper introduces a benchmark extension for evaluating how database management systems handle variable-length data. It reveals performance differences, driven by storage engine design and data growth, that earlier benchmarks left underexplored.
Contribution
It extends the YCSB benchmark with an "extend" operation that appends data to record fields, simulating value growth over time and enabling analysis of DBMS performance on dynamic, variable-sized data.
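As a rough illustration of value growth over time, the sketch below runs a mix of reads and append-style updates against a toy in-memory store. It is not the paper's implementation and not YCSB code: `ToyStore`, `run_workload`, the 1/(rank+1) popularity weights, and all sizes and fractions are assumptions chosen only for illustration.

```python
import random


class ToyStore:
    """Hypothetical in-memory key/value store standing in for a DBMS backend."""

    def __init__(self):
        self.records = {}

    def insert(self, key, value):
        self.records[key] = value

    def read(self, key):
        return self.records[key]

    def extend(self, key, suffix):
        # Append to the existing value instead of overwriting it,
        # so the record grows every time it is extended.
        self.records[key] += suffix


def run_workload(store, num_records=100, num_ops=1000,
                 extend_fraction=0.5, chunk=b"x" * 10, seed=0):
    """Run reads and extends with a skewed key choice; return value lengths."""
    rng = random.Random(seed)
    keys = [f"user{i}" for i in range(num_records)]
    for key in keys:
        store.insert(key, b"x" * 100)  # every record starts at 100 bytes

    # Assumed popularity skew: weight 1/(rank+1), so low-ranked keys are
    # chosen far more often and therefore grow disproportionately.
    weights = [1.0 / (rank + 1) for rank in range(num_records)]

    for _ in range(num_ops):
        key = rng.choices(keys, weights=weights, k=1)[0]
        if rng.random() < extend_fraction:
            store.extend(key, chunk)
        else:
            store.read(key)

    return {key: len(store.records[key]) for key in keys}
```

Calling `run_workload(ToyStore())` returns the final length of each value. Under the assumed skew, popular keys are extended most often, which mirrors the situation the summary describes, where popular records grow disproportionately large.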
Findings
Performance varies significantly with value size growth
Storage engine design impacts handling of variable data
Benchmark reveals differences in query efficiency with growing data
Abstract
The performance of database management systems (DBMS) is traditionally evaluated using benchmarks that focus on workloads with (almost) fixed record lengths. However, some real-world workloads in key/value stores, document databases, and graph databases exhibit significant variability in value lengths, which can lead to performance anomalies, particularly when popular records grow disproportionately large. Existing benchmarks fail to account for this variability, leaving an important aspect of DBMS behavior underexplored. In this paper, we address this gap by extending the Yahoo! Cloud Serving Benchmark (YCSB) to include an "extend" operation, which appends data to record fields, simulating the growth of values over time. Using this modified benchmark, we have measured the performance of three popular DBMS backends: MongoDB, MariaDB with the InnoDB storage engine, and MariaDB with the…
Taxonomy
Topics: Advanced Database Systems and Queries
