Mind the Gap: The Difference Between Coverage and Mutation Score Can Guide Testing Efforts
Kush Jain, Goutamkumar Tulajappa Kalburgi, Claire Le Goues, Alex Groce

TL;DR
This paper introduces the concept of the oracle gap, the difference between code coverage and mutation score, as a new way to evaluate and guide software testing efforts, supported by large-scale empirical studies.
Contribution
The paper proposes the oracle gap framework and, through extensive empirical analysis, shows that it helps identify weakly tested areas.
Findings
The oracle gap reveals critical testing deficiencies that neither coverage nor mutation score shows on its own.
Large-scale studies across Maven projects show the oracle gap's effectiveness in assessing test quality.
Analysis of blockchain projects highlights the oracle gap's relevance in highly critical software.
Abstract
An "adequate" test suite should effectively find all inconsistencies between a system's requirements/specifications and its implementation. Practitioners frequently use code coverage to approximate adequacy, while academics argue that mutation score may better approximate true (oracular) adequacy coverage. High code coverage is increasingly attainable even on large systems via automatic test generation, including fuzzing. In light of all of these options for measuring and improving testing effort, how should a QA engineer spend their time? We propose a new framework for reasoning about the extent, limits, and nature of a given testing effort based on an idea we call the oracle gap, or the difference between source code coverage and mutation score for a given software element. We conduct (1) a large-scale observational study of the oracle gap across popular Maven projects, (2) a study…
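The definition above (coverage minus mutation score, per software element) can be illustrated with a small sketch. This is not the authors' tooling. It assumes line coverage as the coverage measure and killed/total mutants as the mutation score, both as fractions in [0, 1], and every name in it (ElementStats, oracle_gap, rank_by_gap) is hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ElementStats:
    """Hypothetical per-element test statistics (e.g. one file or class)."""
    name: str
    covered_lines: int
    total_lines: int
    killed_mutants: int
    total_mutants: int


def coverage(s: ElementStats) -> float:
    # Assumption: line coverage, as a fraction of executable lines.
    return s.covered_lines / s.total_lines if s.total_lines else 0.0


def mutation_score(s: ElementStats) -> float:
    # Fraction of generated mutants that the test suite detects (kills).
    return s.killed_mutants / s.total_mutants if s.total_mutants else 0.0


def oracle_gap(s: ElementStats) -> float:
    # Coverage minus mutation score for one element. A large positive
    # value means code is executed by tests but changes to it often go
    # undetected.
    return coverage(s) - mutation_score(s)


def rank_by_gap(stats: Iterable[ElementStats]) -> List[ElementStats]:
    # Largest gap first: candidates for stronger assertions (oracles).
    return sorted(stats, key=oracle_gap, reverse=True)


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    sample = [
        ElementStats("Wallet", 80, 100, 75, 100),
        ElementStats("Parser", 90, 100, 30, 100),
    ]
    for s in rank_by_gap(sample):
        print(f"{s.name}: gap={oracle_gap(s):.2f}")
```

In this made-up sample, Parser has higher coverage than Wallet but a much larger gap, so it would be the first place to look for missing or weak assertions.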
Taxonomy
Topics: Software Testing and Debugging Techniques · Software Engineering Research · Software System Performance and Reliability
