Use and Misuse of the Term Experiment in Mining Software Repositories Research
Claudia Ayala, Burak Turhan, Xavier Franch, and Natalia Juristo

TL;DR
This paper examines how the term 'experiment' is used in Mining Software Repositories research, revealing widespread misuse and limited control in studies, and offers recommendations to improve experimental rigor.
Contribution
It characterizes the distinguishing features of MSR experiments, assesses whether the term experiment is used properly in the MSR literature, and provides guidelines to enhance experimental quality in MSR research.
Findings
19% of papers claiming to be experiments are not true experiments
Only one genuine controlled experiment was identified in the literature
Most MSR studies have limited control, affecting result interpretation
Abstract
The significant momentum and importance of Mining Software Repositories (MSR) in Software Engineering (SE) have fostered new opportunities and challenges for extensive empirical research. However, MSR researchers seem to struggle to place the empirical methods they use within the existing empirical SE body of knowledge. This is especially true of MSR experiments. To provide evidence on the special characteristics of MSR experiments and how they differ from the experiments traditionally acknowledged in SE, we elicited the hallmarks that differentiate an experiment from other types of empirical studies and characterized the hallmarks and types of experiments in MSR. We analyzed MSR literature obtained from a small-scale systematic mapping study to assess the use of the term experiment in MSR. We found that 19% of the papers claiming to be an experiment are indeed not an…
