Quantifying Performance Changes with Effect Size Confidence Intervals

Tomas Kalibera; Richard Jones

arXiv:2007.10899 · stat.ME · July 22, 2020 · 23 cites

Open Access

TL;DR

This paper introduces a statistical framework for quantifying uncertainty in performance measurements, improving the rigor and interpretability of experimental results in systems research.

Contribution

It presents a novel statistical model that accounts for non-determinism and provides confidence intervals for performance ratios, enhancing reproducibility and validity.

Findings

01 Provides a method to compute confidence intervals for execution time ratios

02 Addresses non-determinism in performance measurements

03 Enables clearer, more reliable performance comparisons
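To make the idea of a confidence interval for an execution time ratio concrete, here is a minimal sketch using a plain percentile bootstrap. This is a generic technique chosen for illustration and is not necessarily the method the paper develops. The function name `bootstrap_ratio_ci`, its parameters, and the timing values in the usage note are all hypothetical. The sketch also assumes every measurement is independent, which ignores the layered non-determinism (compilation, execution) that the paper is concerned with.

```python
import random
import statistics


def bootstrap_ratio_ci(old_times, new_times, confidence=0.95,
                       resamples=10_000, seed=0):
    """Percentile-bootstrap interval for mean(new_times) / mean(old_times).

    Illustrative only: assumes independent measurements and positive times.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    ratios = []
    for _ in range(resamples):
        # Resample each system's measurements with replacement.
        old_sample = rng.choices(old_times, k=len(old_times))
        new_sample = rng.choices(new_times, k=len(new_times))
        ratios.append(statistics.fmean(new_sample) / statistics.fmean(old_sample))
    ratios.sort()

    # Take the empirical quantiles that cut off alpha/2 in each tail.
    alpha = 1.0 - confidence
    lo_index = max(0, int((alpha / 2) * resamples))
    hi_index = min(resamples - 1, int((1 - alpha / 2) * resamples) - 1)
    return ratios[lo_index], ratios[hi_index]
```

With made-up timings such as `old = [10.2, 10.5, 9.9, 10.1, 10.4]` and `new = [8.1, 8.3, 7.9, 8.0, 8.2]`, calling `bootstrap_ratio_ci(old, new)` returns a lower and an upper bound around the point estimate of the ratio. Reporting that interval rather than the bare ratio is the kind of uncertainty statement the findings call for.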

Abstract

Measuring performance and quantifying a performance change are core evaluation techniques in programming language and systems research. Of 122 recent scientific papers, as many as 65 included experimental evaluation that quantified a performance change using a ratio of execution times. Few of these papers evaluated their results with the level of rigour that has come to be expected in other experimental sciences. The uncertainty of measured results was largely ignored. Scarcely any of the papers mentioned uncertainty in the ratio of the mean execution times, and most did not even mention uncertainty in the two means themselves. Most of the papers failed to address the non-deterministic execution of computer programs (caused by factors such as memory placement, for example), and none addressed non-deterministic compilation. It turns out that the statistical methods presented in the…

Taxonomy

Topics: Software System Performance and Reliability · Software Engineering Research · Parallel Computing and Optimization Techniques