Empirical Confirmation (and Refutation) of Presumptions on Software
Joseph Gil, Maayan Goldstein, Dany Moshkovich

TL;DR
This paper uses statistical analysis to empirically test common presumptions about software metrics. It confirms most of them and reports two surprises: some metrics retain a substantial portion of their reliability even under random changes to architecture, and Boolean-valued metrics flip more often in minor version updates than in major ones.
Contribution
The paper provides empirical evidence on the validity of a suite of 36 code metrics and uncovers unexpected behaviors that challenge assumptions about metric reliability and about how the magnitude of a change relates to version numbering.
Findings
Most presumptions on software metrics are confirmed.
A substantial portion of the reliability of some metrics can be observed even under random changes to architecture.
Boolean metrics flip more often in minor than major version updates.
Abstract
Code metrics are easy to define, but not so easy to justify. It is hard to prove that a metric is valid, i.e., that measured numerical values imply anything on the vaguely defined, yet crucial software properties such as complexity and maintainability. This paper employs statistical analysis and tests to check some "believable" presumptions on the behavior of software and metrics measured for this software. Among those are the reliability presumption implicit in the application of any code metric, and the presumption that the magnitude of change in a software artifact is correlated with changes to its version number. Putting a suite of 36 metrics to the trial, we confirm most of the presumptions. Unexpectedly, we show that a substantial portion of the reliability of some metrics can be observed even in random changes to architecture. Another surprising result is that Boolean-valued…
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Data Quality and Management
