Moving Beyond the Mean: Analyzing Variance in Software Engineering Experiments
Adrian Santos, Markku Oivo, Natalia Juristo

TL;DR
This paper emphasizes the importance of analyzing variance in software engineering experiments, showing that variance provides valuable insights into technology performance beyond mean comparisons.
Contribution
It introduces the role of variance analysis in SE experiments, illustrating its significance through simulations and a real industrial case study on TDD.
Findings
Variance can reveal technology performance differences overlooked by mean analysis.
Technologies with smaller variances may be preferable for consistent performance.
Ignoring variance may lead to misleading conclusions about technology effectiveness.
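The findings above can be illustrated with a small simulation. This is a minimal sketch, not the authors' simulation: the score scale, the means, the standard deviations, the 50-point threshold, and the helper names `simulate_scores` and `share_below` are all hypothetical choices made for illustration. It draws scores for two technologies with the same mean but different spread and reports how many developers fall below a minimum acceptable score.

```python
import random
import statistics


def simulate_scores(mean, sd, n, seed):
    """Draw n normally distributed scores (hypothetical quality scale)."""
    rng = random.Random(seed)
    return [rng.gauss(mean, sd) for _ in range(n)]


def share_below(scores, threshold):
    """Fraction of scores strictly below the threshold."""
    return sum(s < threshold for s in scores) / len(scores)


# Assumed parameters: identical means, very different spreads.
tech_a = simulate_scores(mean=70.0, sd=5.0, n=10_000, seed=1)
tech_b = simulate_scores(mean=70.0, sd=20.0, n=10_000, seed=2)

for name, scores in (("A", tech_a), ("B", tech_b)):
    print(
        f"technology {name}: "
        f"mean={statistics.mean(scores):.1f} "
        f"sd={statistics.stdev(scores):.1f} "
        f"share below 50={share_below(scores, 50.0):.3f}"
    )
```

Under these assumptions a comparison of means shows almost no difference between the two technologies, while the share of developers below the threshold differs substantially, which is the kind of information a mean-only analysis would miss.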
Abstract
Software Engineering (SE) experiments are traditionally analyzed with statistical tests (e.g., t-tests, ANOVAs, etc.) that assume equally spread data across treatments (i.e., the homogeneity of variances assumption). Differences across treatments' variances in SE are not seen as an opportunity to gain insights on technology performance, but instead as a hindrance to analyzing the data. We have studied the role of variance in mature experimental disciplines such as medicine. We illustrate the extent to which variance may inform on technology performance by means of simulation. We analyze a real-life industrial experiment on Test-Driven Development (TDD) where variance may impact technology desirability. Evaluating the performance of technologies just based on means, as traditionally done in SE, may be misleading. Technologies that make developers resemble each other more (i.e.,…
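One way to treat a difference in variances as a finding rather than a violated assumption is to test it directly. The sketch below is an illustration only and is not taken from the paper: it uses a permutation test on absolute deviations from each group's median (in the spirit of a Brown-Forsythe comparison), and the function names, sample sizes, and simulated data are hypothetical.

```python
import random
import statistics


def abs_deviations(sample):
    """Absolute deviations of each observation from the sample median."""
    med = statistics.median(sample)
    return [abs(x - med) for x in sample]


def spread_permutation_test(sample_a, sample_b, n_perm=2000, seed=0):
    """Permutation test for a difference in spread between two groups.

    Returns the observed absolute difference in mean absolute deviation
    and an approximate two-sided p-value.
    """
    rng = random.Random(seed)
    dev_a = abs_deviations(sample_a)
    dev_b = abs_deviations(sample_b)
    observed = abs(statistics.fmean(dev_a) - statistics.fmean(dev_b))

    pooled = dev_a + dev_b
    n_a = len(dev_a)
    hits = 0
    for _ in range(n_perm):
        # Reassign deviations to groups at random under the null hypothesis.
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical data: same mean, smaller spread under the treatment.
data_rng = random.Random(42)
control = [data_rng.gauss(60.0, 20.0) for _ in range(40)]
treatment = [data_rng.gauss(60.0, 8.0) for _ in range(40)]

observed, p_value = spread_permutation_test(control, treatment)
print(f"difference in mean absolute deviation: {observed:.2f}")
print(f"permutation p-value: {p_value:.4f}")
```

A permutation approach is used here because it needs only the standard library and makes few distributional assumptions; in practice a packaged test for equality of variances would serve the same purpose.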
Taxonomy
Topics: Software Engineering Research · Statistical Methods in Clinical Trials · Software Reliability and Analysis Research
