Model Class Selection
Ryan Cecil, Lucas Mentch

TL;DR
This paper introduces model class selection (MCS), a framework that evaluates multiple collections of models and identifies every collection containing at least one optimal model. This makes it possible to formally compare classes of simple, interpretable models against classes of complex machine learning models.
Contribution
The paper generalizes model set selection (MSS) to model class selection (MCS) and shows that, under mild conditions, data-splitting approaches provide general solutions for identifying the collections that contain an optimal model, enabling formal comparisons between model classes.
Findings
Under mild conditions, data-splitting approaches identify the model classes that contain an optimal model.
Simple interpretable models can perform comparably to complex models on certain datasets.
Experimental results validate the MCS framework on simulated and real data.
Abstract
Classical model selection seeks to find a single model within a particular class that optimizes some pre-specified criteria, such as maximizing a likelihood or minimizing a risk. More recently, there has been an increased interest in model set selection (MSS), where the aim is to identify a (confidence) set of near-optimal models. Here, we generalize the MSS framework further by introducing the idea of model class selection (MCS). In MCS, multiple model collections are evaluated, and all collections that contain at least one optimal model are sought for identification. Under mild conditions, data splitting based approaches are shown to provide general solutions for MCS. As a direct consequence, for particular datasets we are able to investigate formally whether classes of simpler and more interpretable statistical models are able to perform on par with more complex black-box machine…
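As a rough illustration of the data-splitting idea described in the abstract, the Python sketch below fits every candidate in each class on one split, picks a representative per class on a second split, and then keeps each class whose representative is not significantly worse than the best one on a third split. This is not the authors' procedure. The three-way split, the one-sided z-test on paired squared-error differences, the threshold `alpha`, and the names `fit_linear`, `fit_knn`, and `select_classes` are all assumptions made for illustration.

```python
import math
import numpy as np

def fit_linear(X, y):
    """Least-squares linear model with intercept; returns a predict function."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ beta

def fit_knn(X, y, k):
    """k-nearest-neighbour regressor (brute force); returns a predict function."""
    def predict(Z):
        d = ((Z[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        idx = np.argsort(d, axis=1)[:, :k]
        return y[idx].mean(axis=1)
    return predict

def select_classes(classes, X, y, alpha=0.05, seed=0):
    """Hypothetical MCS-style sketch: return names of classes that are kept."""
    rng = np.random.default_rng(seed)
    tr, va, te = np.array_split(rng.permutation(len(y)), 3)
    losses = {}
    for name, fitters in classes.items():
        # Fit every model in the class on the training split.
        models = [fit(X[tr], y[tr]) for fit in fitters]
        # Choose the class representative on the validation split.
        val_mse = [np.mean((m(X[va]) - y[va]) ** 2) for m in models]
        best = models[int(np.argmin(val_mse))]
        # Record per-observation squared errors on the test split.
        losses[name] = (best(X[te]) - y[te]) ** 2
    # Reference class: lowest mean test loss (a simplification that
    # ignores the selection effect of picking the reference on test data).
    ref = min(losses, key=lambda n: losses[n].mean())
    kept = []
    for name, loss in losses.items():
        diff = loss - losses[ref]
        se = diff.std(ddof=1) / math.sqrt(len(diff))
        z = diff.mean() / se if se > 0 else 0.0
        p = 0.5 * math.erfc(z / math.sqrt(2))  # one-sided P(Z >= z)
        if p > alpha:  # not significantly worse than the reference
            kept.append(name)
    return kept

# Simulated data with a linear signal (assumed setup, for illustration only).
rng = np.random.default_rng(1)
X = rng.normal(size=(600, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.5, size=600)

classes = {
    "linear": [fit_linear],
    "knn": [lambda X, y, k=k: fit_knn(X, y, k) for k in (1, 5, 15)],
}
kept = select_classes(classes, X, y)
print(kept)
```

Because the simulated signal is linear, the "linear" class should be retained. Whether "knn" is also retained depends on how close its best member comes on the test split, which is the kind of question MCS is meant to answer formally.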
Taxonomy
Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Statistical Methods and Inference
