Not Your Grandfathers Test Set: Reducing Labeling Effort for Testing
Begum Taskazan, Jiri Navratil, Matthew Arnold, Anupama Murthi, Ganesh Venkataraman, Benjamin Elder

TL;DR
This paper introduces a technique that significantly reduces the effort required to create and maintain high-quality test sets, addressing issues of cost, drift, and outdated data in real-world testing scenarios.
Contribution
It presents a simple, effective method that cuts labeling effort by 80-100% across a range of practical scenarios, enabling more sustainable and up-to-date testing practices.
Findings
Labeling effort reduced by 80-100%
Test set quality maintained or improved
Addresses test set drift and maintenance challenges
Abstract
Building and maintaining high-quality test sets remains a laborious and expensive task. As a result, test sets in the real world are often not properly kept up to date and drift from the production traffic they are supposed to represent. The frequency and severity of this drift raise serious concerns over the value of manually labeled test sets in the QA process. This paper proposes a simple but effective technique that drastically reduces the effort needed to construct and maintain a high-quality test set (reducing labeling effort by 80-100% across a range of practical scenarios). This result encourages a fundamental rethinking of the testing process by both practitioners, who can use these techniques immediately to improve their testing, and researchers, who can help address many of the open questions raised by this new approach.
Taxonomy
Topics: Software Testing and Debugging Techniques · Machine Learning and Algorithms · VLSI and Analog Circuit Testing
