Exploring the Garden of Forking Paths in Empirical Software Engineering Research: A Multiverse Analysis
Nathan Cassee, Robert Feldt

TL;DR
This paper demonstrates how multiple analytical choices in empirical software engineering can lead to vastly different results, emphasizing the need for robustness checks and better justification of methodological decisions.
Contribution
It applies multiverse analysis to an SE study, revealing the impact of analytical decisions on outcomes and proposing a structured classification model for methodological transparency.
Findings
Only 0.2% of analysis pipelines reproduced the original results
Most analysis variants produced different or opposite conclusions
Multiverse analysis can improve robustness and reproducibility in SE research
Abstract
In empirical software engineering (SE) research, researchers have considerable freedom to decide how to process data, what operationalizations to use, and which statistical model to fit. Gelman and Loken refer to this freedom as leading to a "garden of forking paths". Although this freedom is often seen as an advantage, it also poses a threat to robustness and replicability: variations in analytical decisions, even when justifiable, can lead to divergent conclusions. To better understand this risk, we conducted a so-called multiverse analysis on a published empirical SE paper. The paper we picked is a Mining Software Repositories (MSR) study, as MSR studies commonly use non-trivial statistical models to analyze post-hoc, observational data. In the study, we identified nine pivotal analytical decisions, each with at least one equally defensible alternative, and systematically reran all the…
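As a rough illustration of what rerunning every combination of analytical decisions involves, the sketch below enumerates a small decision grid as a Cartesian product. The decision names and options are hypothetical placeholders and are not the nine decisions examined in the paper; the sketch only shows how quickly the number of pipelines grows. With nine decisions that each have at least two options, there are at least 2^9 = 512 pipelines.

```python
from itertools import product

# Hypothetical decision points for illustration only; these are NOT the
# nine decisions identified in the paper.
DECISIONS = {
    "outlier_rule": ["keep_all", "drop_top_1pct"],
    "transform": ["raw", "log1p"],
    "model": ["linear", "mixed_effects"],
}


def enumerate_universes(decisions):
    """Yield one dict per pipeline: a single option chosen for each decision."""
    names = list(decisions)
    for combo in product(*(decisions[name] for name in names)):
        yield dict(zip(names, combo))


def count_universes(decisions):
    """Number of pipelines is the product of the option counts per decision."""
    total = 1
    for options in decisions.values():
        total *= len(options)
    return total


universes = list(enumerate_universes(DECISIONS))
print(len(universes), "pipelines, first:", universes[0])
```

In a real multiverse analysis, each enumerated dictionary would parameterize one full run of the data processing and model fitting, and the conclusions of all runs would then be compared against the original result.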
Taxonomy
Topics: Software Engineering Techniques and Practices · Software Engineering Research · Software Testing and Debugging Techniques
