Keeping Mutation Test Suites Consistent and Relevant with Long-Standing Mutants
Milos Ojdanic, Mike Papadakis, Mark Harman

TL;DR
This paper argues for maintaining long-standing mutant suites in mutation testing, rather than re-computing mutants for each release, so that mutation results stay consistent and relevant over time. It reports a 10x improvement in mutant relevance for suites identified this way.
Contribution
It introduces a mutant brittleness measure and demonstrates how to identify long-standing mutant suites with higher relevance, challenging the common practice of re-computing mutants each release.
Findings
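On average, 52% of mutants degrade in relevance over time (measured on 143,500 mutants in 4 systems)
Long-standing mutant suites can be identified with a 10x improvement in mutant relevance
Re-computing mutants per release makes mutation scores inconsistent and not directly comparable across releases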
Abstract
Mutation testing has been demonstrated to be one of the most powerful fault-revealing tools in the tester's tool kit. Much previous work implicitly assumed it to be sufficient to re-compute mutant suites per release. Sadly, this makes mutation results inconsistent; mutant scores from each release cannot be directly compared, making it harder to measure test improvement. Furthermore, regular code change means that a mutant suite's relevance will naturally degrade over time. We measure this degradation in relevance for 143,500 mutants in 4 non-trivial systems, finding that, on average, 52% degrade. We introduce a mutant brittleness measure and use it to audit software systems and their mutation suites. We also demonstrate how consistent-by-construction long-standing mutant suites can be identified with a 10x improvement in mutant relevance over an arbitrary test suite. Our results indicate…
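The consistency problem described in the abstract can be illustrated with a small sketch. It is not the paper's method: the mutant identifiers and kill results are hypothetical, and a "long-standing" mutant is assumed here to be simply one whose identifier appears in both releases. The sketch compares mutation scores computed on per-release (re-computed) suites against scores computed on the shared suite.

```python
from typing import Dict, Tuple

# Maps a mutant identifier to whether the test suite killed it.
Results = Dict[str, bool]


def mutation_score(results: Results) -> float:
    """Return the fraction of mutants killed by the test suite."""
    if not results:
        raise ValueError("empty mutant suite")
    return sum(results.values()) / len(results)


def restrict_to_shared(old: Results, new: Results) -> Tuple[Results, Results]:
    """Keep only mutants present in both releases.

    Assumption for illustration: sharing an identifier across releases
    is what makes a mutant 'long-standing'.
    """
    shared = old.keys() & new.keys()
    return {m: old[m] for m in shared}, {m: new[m] for m in shared}


# Hypothetical kill results for two releases with re-computed mutant suites.
release_1 = {"m1": True, "m2": False, "m3": False, "m4": True}
release_2 = {"m1": True, "m2": True, "m5": False, "m6": False}

# Scores over different mutant sets: the numbers are not directly comparable.
recomputed = (mutation_score(release_1), mutation_score(release_2))

# Scores over the same mutant set: a change reflects the tests, not the mutants.
old_shared, new_shared = restrict_to_shared(release_1, release_2)
consistent = (mutation_score(old_shared), mutation_score(new_shared))

print("re-computed suites:", recomputed)
print("shared suite:      ", consistent)
```

With this made-up data, the re-computed suites yield the same score for both releases even though the tests now kill a mutant they previously missed, while the shared suite exposes that improvement.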
Taxonomy
Topics: Software Testing and Debugging Techniques · Radiation Effects in Electronics · Software System Performance and Reliability
