Sequential algorithmic modification with test data reuse
Jean Feng, Gene Pennello, Nicholas Petrick, Berkman Sahiner, Romain Pirracchio, Alexej Gossmann

TL;DR
This paper studies statistical procedures for validating successive modifications to a machine learning model on the same test dataset, aiming to increase power while controlling error rates.
Contribution
It introduces extensions of sequentially rejective graphical procedures (SRGPs) that account for correlation between modifications, for validating adaptive model updates on reused test data.
Findings
SRGPs control error rates effectively.
A higher number of beneficial modifications are approved.
Improved validation power over previous methods.
Abstract
After initial release of a machine learning algorithm, the model can be fine-tuned by retraining on subsequently gathered data, adding newly discovered features, or more. Each modification introduces a risk of deteriorating performance and must be validated on a test dataset. It may not always be practical to assemble a new dataset for testing each modification, especially when most modifications are minor or are implemented in rapid succession. Recent works have shown how one can repeatedly test modifications on the same dataset and protect against overfitting by (i) discretizing test results along a grid and (ii) applying a Bonferroni correction to adjust for the total number of modifications considered by an adaptive developer. However, the standard Bonferroni correction is overly conservative when most modifications are beneficial and/or highly correlated. This work investigates…
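The two baseline ingredients named in the abstract, discretizing test results along a grid and applying a Bonferroni correction, can be illustrated with a minimal Python sketch. This is not the authors' procedure: the function names `discretize` and `bonferroni_approve`, the grid step, and the way the number of hypotheses is supplied are all assumptions made for illustration.

```python
def discretize(value, step):
    """Snap a test metric to the nearest point on a grid of width `step`.

    Hypothetical helper: reporting only the rounded value limits how much
    an adaptive developer can learn about the test set from each query.
    """
    return round(value / step) * step


def bonferroni_approve(p_values, alpha, n_hypotheses):
    """Approve each modification whose p-value clears the Bonferroni threshold.

    `n_hypotheses` is the total number of modifications that could be
    considered (an assumption: the caller supplies it). Every test is run
    at level alpha / n_hypotheses, regardless of earlier outcomes.
    """
    threshold = alpha / n_hypotheses
    return [p <= threshold for p in p_values]


# Example: three candidate modifications, ten hypotheses budgeted in total.
decisions = bonferroni_approve([0.001, 0.02, 0.004], alpha=0.05, n_hypotheses=10)
reported_accuracy = discretize(0.87, step=0.05)
```

Because every test is held to the same threshold `alpha / n_hypotheses`, none of the error budget is reallocated after a modification is approved, which is one way to see why the abstract calls the standard correction overly conservative when most modifications are beneficial.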
Taxonomy
Topics: Machine Learning and Data Classification · Software Testing and Debugging Techniques · Software Engineering Research
