Benchmarking Blunders and Things That Go Bump in the Night

Neil J. Gunther

arXiv:cs/0404043 · cs.PF · September 29, 2009 · 3 cites

TL;DR

This paper discusses common pitfalls in benchmarking computer systems, illustrating how to avoid systematic mistakes through real-world examples and simple performance models to ensure accurate performance assessment.

Contribution

It introduces practical methods and models to identify and correct benchmarking errors, improving the reliability of performance measurements.

Findings

01

Benchmark flaws can be identified and corrected using simple performance models.

02

Misinterpretation of benchmark data often leads to incorrect conclusions.

03

Proper benchmarking practices prevent costly mistakes in system deployment.

Abstract

Benchmarking, by which I mean any computer system that is driven by a controlled workload, is the ultimate in performance testing and simulation. Aside from being a form of institutionalized cheating, it also offers countless opportunities for systematic mistakes in the way the workloads are applied and the resulting measurements interpreted. "Right test, wrong conclusion" is a ubiquitous mistake that happens because test engineers tend to treat data as divine. Such reverence is not only misplaced, it's also a sure ticket to production hell when the application finally goes live. I demonstrate how such mistakes can be avoided by means of two war stories that are real WOPRs: (a) how to resolve benchmark flaws over the psychic hotline, and (b) how benchmarks can go flat with too much Java juice. In each case I present simple performance models and show how they can be applied to correctly…

Taxonomy

Topics: Software System Performance and Reliability · Mobile Agent-Based Network Management · Software Testing and Debugging Techniques