Some Theory For Practical Classifier Validation
Eric Bax, Ya Le

TL;DR
This paper compares two classifier validation methods, SVOOSH and WAG, and identifies conditions under which WAG is the better choice, notably complex hypothesis classes and limited training data.
Contribution
It introduces a theoretical comparison between SVOOSH and WAG, highlighting scenarios where WAG offers benefits over traditional validation methods.
Findings
WAG can outperform SVOOSH with complex hypothesis classes.
Limited training data favors the use of WAG.
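In WAG, the rate of disagreement between the holdout classifier and the classifier trained on all in-sample data is an upper bound on the difference between their error rates.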
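The disagreement bound in the last finding follows from a pointwise inequality. The notation here is assumed for illustration and is not taken from the paper: g is the classifier trained on all in-sample data, g^- is the holdout classifier, and (x, y) is an example drawn from the out-of-sample distribution.

```latex
% If g misclassifies x, then either g^- also misclassifies x,
% or g^- is correct and therefore disagrees with g.
\mathbb{1}[g(x) \neq y] \;\le\; \mathbb{1}[g^-(x) \neq y] + \mathbb{1}[g(x) \neq g^-(x)]

% Taking expectations over (x, y):
\Pr[g(x) \neq y] \;\le\; \Pr[g^-(x) \neq y] + \Pr[g(x) \neq g^-(x)]

% The same argument with g and g^- swapped gives the two-sided form:
\bigl|\Pr[g(x) \neq y] - \Pr[g^-(x) \neq y]\bigr| \;\le\; \Pr[g(x) \neq g^-(x)]
```

An upper bound on the holdout classifier's error rate, plus the disagreement rate, is therefore an upper bound on the error rate of the classifier trained on all in-sample data.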
Abstract
We compare and contrast two approaches to validating a trained classifier while using all in-sample data for training. One is simultaneous validation over an organized set of hypotheses (SVOOSH), the well-known method that began with VC theory. The other is withhold and gap (WAG). WAG withholds a validation set, trains a holdout classifier on the remaining data, uses the validation data to validate that classifier, then adds the rate of disagreement between the holdout classifier and one trained using all in-sample data, which is an upper bound on the difference in error rates. We show that complex hypothesis classes and limited training data can make WAG a favorable alternative.
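The WAG procedure described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the nearest-centroid learner, the helper names (`train_centroids`, `predict`, `hoeffding_margin`, `wag_bound`), the use of a separate unlabeled sample to estimate the disagreement rate, and the choice of Hoeffding bounds with the failure probability split evenly between the two estimates are all choices made here for concreteness.

```python
import math
import random


def train_centroids(examples):
    """Toy learner (an assumption): nearest-centroid; returns {label: centroid}."""
    sums, counts = {}, {}
    for x, y in examples:
        s = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )


def hoeffding_margin(n, delta):
    """One-sided Hoeffding margin for a mean of n i.i.d. values in [0, 1]."""
    return math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def wag_bound(in_sample, unlabeled, n_val, delta=0.05, seed=0):
    """Withhold-and-gap bound on the error of the classifier trained on all data.

    in_sample: list of (x, y) pairs; unlabeled: list of x drawn independently
    of in_sample (assumed available here to estimate the disagreement rate).
    """
    rng = random.Random(seed)
    data = list(in_sample)
    rng.shuffle(data)
    validation, remaining = data[:n_val], data[n_val:]

    holdout = train_centroids(remaining)  # never sees the validation set
    full = train_centroids(data)          # uses all in-sample data

    # Validation error of the holdout classifier.
    val_error = sum(predict(holdout, x) != y for x, y in validation) / len(validation)
    # Empirical rate of disagreement between the two classifiers.
    disagreement = sum(
        predict(holdout, x) != predict(full, x) for x in unlabeled
    ) / len(unlabeled)

    # Each estimate gets delta / 2, so both bounds hold together with
    # probability at least 1 - delta (union bound).
    bound = (
        val_error + hoeffding_margin(len(validation), delta / 2)
        + disagreement + hoeffding_margin(len(unlabeled), delta / 2)
    )
    return {"val_error": val_error, "disagreement": disagreement, "bound": bound}


if __name__ == "__main__":
    rng = random.Random(1)

    def sample(n):
        # Two Gaussian classes in the plane, centered at (0, 0) and (2, 2).
        labels = [rng.randint(0, 1) for _ in range(n)]
        return [((rng.gauss(2.0 * y, 1.0), rng.gauss(2.0 * y, 1.0)), y) for y in labels]

    train = sample(400)
    pool = [x for x, _ in sample(1000)]
    print(wag_bound(train, pool, n_val=100))
```

In this sketch a larger withheld set tightens the validation term but tends to increase disagreement, since the holdout classifier is trained on fewer examples.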
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Imbalanced Data Classification Techniques
