Assessing Practitioner Beliefs about Software Defect Prediction
N.C. Shrikanth, Tim Menzies

TL;DR
This study investigates the validity of common beliefs about software defect prediction by analyzing extensive open-source project data, revealing that many beliefs are only sporadically supported and emphasizing the need to consider temporal changes in effects.
Contribution
It provides empirical evidence on the support for ten prevalent beliefs in software defect prediction and highlights the importance of reporting how effects evolve over time.
Findings
Strong support for a few beliefs, such as the belief that commits with more added and removed lines are more bug-prone.
Most beliefs are only sporadically supported across projects and releases.
Effects often change or disappear over time, affecting belief validity.
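To make the last finding concrete, here is a minimal sketch of checking one belief ("larger commits are more bug-prone") release by release rather than over a whole project. The records, the release labels, and the `churn_gap_by_release` helper are all hypothetical, and the median-gap measure is an illustrative stand-in, not the statistical method used in the paper.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (release, lines added + removed, commit later linked to a bug?)
commits = [
    ("r1", 500, True), ("r1", 420, True), ("r1", 30, False), ("r1", 12, False),
    ("r2", 40, True), ("r2", 45, True), ("r2", 35, False), ("r2", 50, False),
]

def churn_gap_by_release(records):
    """Median churn of buggy commits minus median churn of clean commits, per release."""
    buggy, clean = defaultdict(list), defaultdict(list)
    for release, churn, is_buggy in records:
        (buggy if is_buggy else clean)[release].append(churn)
    # Only compare releases that contain both buggy and clean commits.
    return {
        r: median(buggy[r]) - median(clean[r])
        for r in sorted(buggy.keys() & clean.keys())
    }

gaps = churn_gap_by_release(commits)
for release, gap in gaps.items():
    print(release, gap)
```

In this toy data the gap is large in `r1` and vanishes in `r2`, which is the kind of pattern a single pooled, project-wide number would hide.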
Abstract
Just because software developers say they believe in "X", that does not necessarily mean that "X" is true. As shown here, there exist numerous beliefs listed in the recent Software Engineering literature which are only supported by small portions of the available data. Hence we ask: what is the source of this disconnect between beliefs and evidence? To answer this question we look for evidence for ten beliefs within 300,000+ changes seen in dozens of open-source projects. Some of those beliefs had strong support across all the projects; specifically, "A commit that involves more added and removed lines is more bug-prone" and "Files with fewer lines contributed by their owners (who contribute most changes) are bug-prone". Most of the widely-held beliefs studied are only sporadically supported in the data; i.e. large effects can appear in project data and then disappear in subsequent…
