Revisiting the size effect in software fault prediction models
Amjed Tahir, Kwabena E. Bennin, Stephen G. MacDonell, Stephen Marsland

TL;DR
This study rigorously tests whether class size influences the relationship between object-oriented metrics and fault proneness, finding no consistent evidence of a significant mediation or moderation effect across multiple systems.
Contribution
It applies robust statistical methods to clarify the role of class size in fault prediction models, challenging previous assumptions about its significance.
Findings
No strong evidence of mediation or moderation effects of class size.
Size appears to mediate the effect of some metrics, such as CBO and Fan-out, but not consistently across systems.
Size affects WMC and CBO as a moderator in most systems.
Abstract
BACKGROUND: In object-oriented (OO) software systems, class size has been acknowledged as having an indirect effect on the relationship between certain artifact characteristics, captured via metrics, and fault-proneness, and therefore it is recommended to control for size when designing fault prediction models. AIM: To use robust statistical methods to assess whether there is evidence of any true effect of class size on fault prediction models. METHOD: We examine the potential mediation and moderation effects of class size on the relationships between OO metrics and number of faults. We employ regression analysis and bootstrapping-based methods to investigate the mediation and moderation effects in two widely used datasets comprising seventeen systems. RESULTS: We find no strong evidence of a significant mediation or moderation effect of class size on the relationships between OO metrics…
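To make the two effects named in the METHOD concrete, here is a minimal sketch of one way to estimate them. It is not the authors' implementation, and it rests on several assumptions that do not come from the paper:

- plain least-squares regression is used for every model;
- mediation is measured as the product-of-coefficients indirect effect;
- moderation is measured as the coefficient of a mean-centred metric-by-size product term;
- confidence intervals come from a percentile bootstrap;
- the data are synthetic.

All function and variable names are hypothetical.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y ~ X (X must already hold an intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def indirect_effect(metric, size, faults):
    """Mediation estimate a*b: a is metric -> size, b is size -> faults given metric."""
    ones = np.ones_like(metric, dtype=float)
    a = ols(np.column_stack([ones, metric]), size)[1]
    b = ols(np.column_stack([ones, metric, size]), faults)[2]
    return a * b

def interaction_effect(metric, size, faults):
    """Moderation estimate: coefficient of the (mean-centred) metric*size term."""
    ones = np.ones_like(metric, dtype=float)
    m = metric - metric.mean()
    s = size - size.mean()
    return ols(np.column_stack([ones, m, s, m * s]), faults)[3]

def bootstrap_ci(stat, metric, size, faults, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for stat, resampling classes with replacement."""
    rng = np.random.default_rng(seed)
    n = len(metric)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)
        estimates[i] = stat(metric[idx], size[idx], faults[idx])
    lower, upper = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
    return lower, upper

# Synthetic data (an assumption for illustration): size partly carries the
# metric's effect on faults, and there is no metric*size interaction.
rng = np.random.default_rng(42)
metric = rng.normal(size=500)
size = 2.0 * metric + rng.normal(size=500)
faults = 0.3 * metric + 0.5 * size + rng.normal(size=500)

mediation_point = indirect_effect(metric, size, faults)
mediation_ci = bootstrap_ci(indirect_effect, metric, size, faults)
moderation_point = interaction_effect(metric, size, faults)
moderation_ci = bootstrap_ci(interaction_effect, metric, size, faults)
print("indirect effect:", mediation_point, mediation_ci)
print("interaction effect:", moderation_point, moderation_ci)
```

In this setup, a bootstrap interval for the indirect effect that excludes zero would be read as evidence of mediation by size, and an interval for the interaction coefficient that excludes zero as evidence of moderation. Fault counts are often modelled with count regressions rather than least squares, so the linear models here are a simplification.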
