Learning to predict test effectiveness
Morteza Zakeri-Nasrabadi, Saeed Parsa

TL;DR
This paper introduces a machine learning approach that predicts test effectiveness from source code metrics. Accurately estimating the Coverageability of a class before testing can help reduce testing costs.
Contribution
It presents a novel ensemble regression model and a new metric called Coverageability, improving prediction accuracy over existing models.
Findings
The model achieved a mean absolute error (MAE) of 0.032 and an R² of 0.855 on Java classes.
Class strict cyclomatic complexity is the most influential feature.
The proposed model outperforms state-of-the-art coverage prediction models by up to 20.71% in R² score.
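For readers unfamiliar with the two scores reported in these findings, the sketch below shows how MAE and R² are conventionally computed for a regression model. The function names and the sample values are illustrative assumptions, not data or code from the paper.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up Coverageability labels and predictions for three hypothetical classes.
actual = [0.2, 0.5, 0.9]
predicted = [0.25, 0.45, 0.85]

mae = mean_absolute_error(actual, predicted)
r2 = r2_score(actual, predicted)
print(f"MAE = {mae:.3f}, R2 = {r2:.3f}")
```

A lower MAE and an R² closer to 1 both indicate predictions that track the measured values more closely.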
Abstract
The high cost of testing can be dramatically reduced, provided that coverability, as an inherent feature of the code under test, is predictable. This article offers a machine learning model that predicts the extent to which tests could cover a class, in terms of a new metric called Coverageability. The prediction model consists of an ensemble of four regression models. The learning samples are feature vectors whose features are source code metrics computed for a class. Each sample is labeled with the Coverageability value computed for its corresponding class. We offer a mathematical model to evaluate test effectiveness in terms of the size and coverage of the test suite generated automatically for each class. We extend the feature space by introducing a new approach to defining sub-metrics in terms of existing source code metrics. Using feature importance analysis…
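The abstract describes an ensemble of four regression models applied to per-class feature vectors. A minimal sketch of that idea follows, under loud assumptions: the excerpt does not say how the four models are combined, so simple averaging is used here; the four base models are hypothetical hand-written functions standing in for trained regressors; and the two features (cyclomatic complexity, lines of code) are chosen only for illustration.

```python
from typing import Callable, List, Sequence

# A regressor maps a feature vector of source code metrics to a predicted score.
Regressor = Callable[[Sequence[float]], float]


def make_averaging_ensemble(models: List[Regressor]) -> Regressor:
    """Combine base regressors by averaging their predictions (an assumed rule)."""
    def predict(features: Sequence[float]) -> float:
        predictions = [model(features) for model in models]
        return sum(predictions) / len(predictions)
    return predict


# Hypothetical stand-ins for four trained models; features = [cyclomatic_complexity, lines_of_code].
base_models: List[Regressor] = [
    lambda f: 0.9,                  # constant baseline
    lambda f: 1.0 - 0.02 * f[0],    # penalizes complexity
    lambda f: 1.0 - 0.001 * f[1],   # penalizes size
    lambda f: 0.8,                  # second constant baseline
]

ensemble = make_averaging_ensemble(base_models)
sample_class = [10.0, 200.0]  # made-up metrics for one class
print(f"Predicted Coverageability: {ensemble(sample_class):.3f}")
```

In practice each base model would be fitted on labeled samples, and the combination rule could weight the models rather than average them equally.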
Taxonomy
Methods: Test · Masked autoencoder
