TL;DR
This paper compares the mutants of PIT, a popular mutation testing tool, with a comprehensive mutant set drawn from the literature. PIT's restricted set is easier to kill and fails to capture faults that the comprehensive mutants encode, which can weaken mutation-based test assessment.
Contribution
It provides an empirical assessment of PIT's mutants versus comprehensive mutants, highlighting limitations and suggesting improvements for mutation testing practices.
Findings
PIT's mutants are less effective at fault detection than comprehensive mutants.
Comprehensive mutants encode faults not captured by PIT's mutants in 11% to 62% of the Java classes of the considered projects.
Using comprehensive mutants could improve mutation testing effectiveness.
Abstract
Mutation testing is used extensively to support experimentation in software engineering studies. Its application to real-world projects is possible thanks to modern tools that automate the whole mutation analysis process. However, popular mutation testing tools use a restrictive set of mutants which do not conform to the community standards as supported by the mutation testing literature. This can be problematic since the effectiveness of mutation depends on its mutants. We therefore examine how effective the mutants of a popular mutation testing tool, named PIT, are compared to comprehensive ones, as drawn from the literature and personal experience. We show that comprehensive mutants are harder to kill and encode faults not captured by the mutants of PIT for a range of 11% to 62% of the Java classes of the considered projects.
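To make the abstract's central claim concrete, the following minimal Python sketch shows how a test suite can kill every mutant in a restricted set while a mutant from a larger set survives. Everything in it is an assumption made for illustration: the `original` function, the mutant names, and the two mutant sets are hypothetical and do not reproduce PIT's actual operators or the comprehensive set studied in the paper.

```python
from typing import Callable, Dict, List, Tuple

Program = Callable[[int, int], int]
Test = Tuple[int, int]  # an input pair; the original program's output is the oracle


def original(a: int, b: int) -> int:
    """Program under test: return the larger of two integers."""
    return a if a > b else b


# Hypothetical restricted mutant set (illustrative, not PIT's real operators).
restricted: Dict[str, Program] = {
    "negate_condition": lambda a, b: a if not (a > b) else b,
    "return_zero": lambda a, b: 0,
}

# Hypothetical comprehensive set: the restricted mutants plus two more.
comprehensive: Dict[str, Program] = {
    **restricted,
    "replace_else_with_a": lambda a, b: a if a > b else a,
    "off_by_one_return": lambda a, b: (a if a > b else b) + 1,
}


def killed(mutant: Program, tests: List[Test]) -> bool:
    """A mutant is killed if at least one test observes a different output."""
    return any(mutant(*t) != original(*t) for t in tests)


def mutation_score(mutants: Dict[str, Program], tests: List[Test]) -> float:
    """Fraction of mutants killed by the test suite."""
    return sum(killed(m, tests) for m in mutants.values()) / len(mutants)


def surviving(mutants: Dict[str, Program], tests: List[Test]) -> List[str]:
    """Names of the mutants that no test distinguishes from the original."""
    return sorted(name for name, m in mutants.items() if not killed(m, tests))


suite: List[Test] = [(2, 1)]

print(mutation_score(restricted, suite))     # every restricted mutant is killed
print(mutation_score(comprehensive, suite))  # one comprehensive mutant survives
print(surviving(comprehensive, suite))
```

Under these assumptions the suite looks complete when judged against the restricted set, yet `replace_else_with_a` survives because no test exercises the case where `b` is larger. Adding the input `(1, 2)` kills it, which is the sense in which a larger mutant set can expose test weaknesses that a smaller one leaves hidden.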
