Tell-Tale Tail Latencies: Pitfalls and Perils in Database Benchmarking
Michael Fruth, Stefanie Scherzinger, Wolfgang Mauerer, Ralf Ramsauer

TL;DR
This paper highlights the importance of accurately measuring tail latencies in database benchmarking, identifies pitfalls in current Java-based approaches, and advocates for redesigned benchmarks to better capture true performance characteristics.
Contribution
The paper identifies specific ways in which current benchmarking methods distort tail latency measurements, and proposes redesign strategies for more faithful performance evaluation.
Findings
Java-based benchmarking approaches can substantially distort tail latency observations.
Focusing solely on throughput overlooks critical tail latency issues.
Redesigned benchmarks can improve the accuracy of tail latency characterization.
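To illustrate why a throughput or average-case figure can hide tail behaviour, here is a minimal sketch in Python. The latency values are hypothetical and not taken from the paper, and the nearest-rank percentile definition is an assumption chosen for simplicity; the paper's own tooling may compute percentiles differently.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical request latencies in milliseconds:
# 980 fast requests and 20 requests delayed by some stall.
latencies_ms = [1.0] * 980 + [200.0] * 20

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)

print(f"mean={mean_ms:.2f} ms  p50={p50:.1f} ms  p99={p99:.1f} ms")
```

In this made-up sample the mean stays below 5 ms and the median is 1 ms, while the 99th percentile is 200 ms. A report limited to average-case numbers would therefore miss the delayed requests entirely.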
Abstract
The performance of database systems is usually characterised by their average-case (i.e., throughput) behaviour in standardised or de-facto standard benchmarks like TPC-X or YCSB. While tails of the latency (i.e., response time) distribution receive considerably less attention, they have been identified as a threat to the overall system performance: In large-scale systems, even a fraction of requests delayed can build up into delays perceivable by end users. To eradicate large tail latencies from database systems, the ability to faithfully record them, and likewise pinpoint them to the root causes, is imminently required. In this paper, we address the challenge of measuring tail latencies using standard benchmarks, and identify subtle perils and pitfalls. In particular, we demonstrate how Java-based benchmarking approaches can substantially distort tail latency observations, and discuss…
Taxonomy
Topics: Software System Performance and Reliability · Distributed systems and fault tolerance · Cloud Computing and Resource Management
