Systematic Training and Testing for Machine Learning Using Combinatorial Interaction Testing
Tyler Cody, Erin Lanus, Daniel D. Doyle, Laura Freeman

TL;DR
This paper explores the use of combinatorial coverage to systematically select and evaluate training and testing data for machine learning models, demonstrating its effectiveness in stress testing, robustness, and domain adaptation using the MNIST dataset.
Contribution
It adapts combinatorial interaction testing for data selection in machine learning, focusing on input-output features rather than model internals, and addresses prior criticisms of coverage methods.
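To make the idea of coverage over input-output features concrete, the following is a minimal sketch of a t-way combinatorial coverage measure over discretized features. The feature names, the `domains` list, and the function `t_way_coverage` are hypothetical illustrations, not taken from the paper; the paper's exact metric and feature set are not given in this summary.

```python
from itertools import combinations

def t_way_coverage(rows, domains, t=2):
    """Fraction of all possible t-way value combinations that appear in rows.

    rows    : list of tuples, one discretized feature value per column
    domains : list of tuples, the allowed values of each column
    Assumes every value in rows belongs to its column's domain.
    """
    covered = 0
    total = 0
    for cols in combinations(range(len(domains)), t):
        # Number of value combinations that could appear for these columns.
        possible = 1
        for c in cols:
            possible *= len(domains[c])
        # Distinct value combinations that actually appear in the data.
        seen = {tuple(row[c] for c in cols) for row in rows}
        covered += len(seen)
        total += possible
    return covered / total

# Hypothetical discretized features: label, stroke width, slant.
domains = [("0", "1", "2"), ("thin", "thick"), ("left", "upright", "right")]
rows = [
    ("0", "thin", "left"),
    ("1", "thick", "upright"),
    ("2", "thin", "right"),
    ("0", "thick", "right"),
]

pairwise = t_way_coverage(rows, domains, t=2)
single = t_way_coverage(rows, domains, t=1)
```

In this toy data every individual value appears at least once, but only 12 of the 21 possible value pairs do, so pairwise coverage exposes gaps that a per-feature check would miss.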
Findings
Combinatorial coverage can identify stress points in ML models.
Training sets selected via combinatorial coverage improve model robustness.
Coverage metrics are effective even without access to model internals.
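One way a training set could be selected for coverage is a greedy pass that repeatedly picks the sample contributing the most not-yet-covered feature interactions. This is only a sketch under that assumption: the summary does not describe the authors' selection procedure, and the names `interactions` and `greedy_select` and the toy pool are hypothetical.

```python
from itertools import combinations

def interactions(row, t=2):
    """All t-way (column indices, values) interactions present in one row."""
    return {
        (cols, tuple(row[c] for c in cols))
        for cols in combinations(range(len(row)), t)
    }

def greedy_select(pool, budget, t=2):
    """Pick up to `budget` row indices, each adding the most new interactions.

    Assumed heuristic for illustration; stops early once no remaining row
    adds an uncovered interaction. Ties go to the lowest index.
    """
    covered = set()
    chosen = []
    remaining = set(range(len(pool)))
    while remaining and len(chosen) < budget:
        best = max(
            sorted(remaining),
            key=lambda i: len(interactions(pool[i], t) - covered),
        )
        gain = interactions(pool[best], t) - covered
        if not gain:
            break
        chosen.append(best)
        covered |= gain
        remaining.remove(best)
    return chosen, covered

# Hypothetical pool of discretized samples: (label, stroke width).
pool = [("0", "thin"), ("0", "thin"), ("1", "thick"), ("0", "thick")]
chosen, covered = greedy_select(pool, budget=4, t=2)
```

The duplicate at index 1 is never selected because it adds no new interaction, which is the sense in which coverage-driven selection favors diverse samples over redundant ones.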
Abstract
This paper demonstrates the systematic use of combinatorial coverage for selecting and characterizing test and training sets for machine learning models. The presented work adapts combinatorial interaction testing, which has been successfully leveraged in identifying faults in software testing, to characterize data used in machine learning. The MNIST hand-written digits data is used to demonstrate that combinatorial coverage can be used to select test sets that stress machine learning model performance, to select training sets that lead to robust model performance, and to select data for fine-tuning models to new domains. Thus, the results posit combinatorial coverage as a holistic approach to training and testing for machine learning. In contrast to prior work, which has focused on the use of coverage in regard to the internals of neural networks, this paper considers coverage over…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Software Testing and Debugging Techniques · Machine Learning and Data Classification
