Will My Tests Tell Me If I Break This Code?
Rainer Niedermayr, Elmar Juergens, Stefan Wagner

TL;DR
This paper empirically evaluates code coverage as a measure of test suite effectiveness, using extreme mutation testing to show where it falls short, especially for system tests.
Contribution
Based on an empirical mutation testing study, the paper shows that code coverage is a valid indicator of test effectiveness for unit tests but not for system tests.
Findings
Code coverage correlates with test effectiveness for unit tests.
High coverage does not guarantee fault detection in system tests.
Pseudo-tested methods are prevalent in system tests, reducing coverage reliability.
Abstract
Automated tests play an important role in software evolution because they can rapidly detect faults introduced during changes. In practice, code-coverage metrics are often used as criteria to evaluate the effectiveness of test suites with focus on regression faults. However, code coverage only expresses which portion of a system has been executed by tests, but not how effective the tests actually are in detecting regression faults. Our goal was to evaluate the validity of code coverage as a measure for test effectiveness. To do so, we conducted an empirical study in which we applied an extreme mutation testing approach to analyze the tests of open-source projects written in Java. We assessed the ratio of pseudo-tested methods (those tested in a way such that faults would not be detected) to all covered methods and judged their impact on the software project. The results show that the…
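The idea of a pseudo-tested method can be illustrated with a toy sketch. It is not the authors' tooling: the abstract does not describe their implementation, and every name here (`METHODS`, `EXTREME_MUTANTS`, `find_pseudo_tested`, the two sample methods and tests) is hypothetical. The sketch assumes the common reading of extreme mutation, in which a method's whole body is replaced by a trivial one (do nothing, or return a default value). A covered method whose mutant still lets the test suite pass is counted as pseudo-tested.

```python
from typing import Callable, Dict, List

# Hypothetical production code: two tiny "methods".
def add(a: int, b: int) -> int:
    return a + b

def log_total(total: int, sink: List[str]) -> None:
    sink.append(f"total={total}")

# Lookup table so implementations can be swapped at run time.
# This stands in for the code-level mutation a real tool would perform.
METHODS: Dict[str, Callable] = {"add": add, "log_total": log_total}

# Hypothetical tests. Both methods are executed (covered),
# but only test_add checks the result.
def test_add() -> bool:
    return METHODS["add"](2, 3) == 5

def test_log_total() -> bool:
    sink: List[str] = []
    METHODS["log_total"](5, sink)  # runs the method, never inspects sink
    return True

TESTS = [test_add, test_log_total]

# Extreme mutants: the whole body is replaced by a trivial one.
EXTREME_MUTANTS: Dict[str, Callable] = {
    "add": lambda a, b: 0,                  # return a default value
    "log_total": lambda total, sink: None,  # do nothing
}

def suite_passes() -> bool:
    return all(test() for test in TESTS)

def find_pseudo_tested() -> List[str]:
    """Return covered methods whose extreme mutant goes undetected."""
    pseudo = []
    for name, mutant in EXTREME_MUTANTS.items():
        original = METHODS[name]
        METHODS[name] = mutant
        try:
            if suite_passes():
                pseudo.append(name)
        finally:
            METHODS[name] = original  # always restore the real method
    return pseudo

pseudo_tested = find_pseudo_tested()
# Ratio of pseudo-tested methods to all covered methods.
ratio = len(pseudo_tested) / len(METHODS)
print(pseudo_tested, ratio)
```

In this toy setup both methods count as covered, yet only `add` is protected by an assertion, so coverage alone overstates how much of the code the tests would actually guard against regressions.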
