Consensus-Driven Active Model Selection
Justin Kay, Grant Van Horn, Subhransu Maji, Daniel Sheldon, and Sara Beery

TL;DR
This paper introduces CODA, a probabilistic, consensus-driven method for active model selection. It uses agreement and disagreement among candidate models to decide which data points to label, so the best model can be identified with significantly less annotation effort.
Contribution
We develop CODA, a novel active model selection framework that models the relationships between classifiers, categories, and data points, and uses Bayesian inference to update beliefs about which model is best, reducing labeling costs in model selection tasks.
Findings
CODA reduces the annotation effort needed to identify the best model by over 70% compared to previous methods.
CODA outperforms existing active model selection techniques across 26 benchmark tasks.
The probabilistic framework effectively captures model consensus and disagreement.
Abstract
The widespread availability of off-the-shelf machine learning models poses a challenge: which model, of the many available candidates, should be chosen for a given data analysis task? This question of model selection is traditionally answered by collecting and annotating a validation dataset -- a costly and time-intensive process. We propose a method for active model selection, using predictions from candidate models to prioritize the labeling of test data points that efficiently differentiate the best candidate. Our method, CODA, performs consensus-driven active model selection by modeling relationships between classifiers, categories, and data points within a probabilistic framework. The framework uses the consensus and disagreement between models in the candidate pool to guide the label acquisition process, and Bayesian inference to update beliefs about which model is best as more…
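The loop the abstract describes (a consensus-based starting belief, label acquisition guided by disagreement, and a Bayesian update after each label) can be illustrated with a deliberately simplified sketch. This is not CODA's actual probabilistic model: it assumes an independent Beta posterior over each model's accuracy, uses vote entropy as a stand-in acquisition rule, and every name in it (`vote_entropy`, `consensus_prior`, `prob_best`, `select_model`, `strength`, `oracle`) is hypothetical.

```python
import math
import random
from collections import Counter


def vote_entropy(votes):
    """Shannon entropy of the label distribution among the models' votes."""
    counts = Counter(votes)
    total = len(votes)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def consensus_prior(preds, strength=2.0):
    """Beta(alpha, beta) prior per model from its agreement with the majority vote.

    Assumption: agreement with the consensus is weak evidence of accuracy.
    """
    n_points = len(preds[0])
    majority = [Counter(m[i] for m in preds).most_common(1)[0][0]
                for i in range(n_points)]
    prior = []
    for m in preds:
        agree = sum(p == y for p, y in zip(m, majority)) / n_points
        prior.append([1.0 + strength * agree, 1.0 + strength * (1.0 - agree)])
    return prior


def prob_best(posterior, rng, n_samples=2000):
    """Monte Carlo estimate of P(model h has the highest accuracy)."""
    wins = [0] * len(posterior)
    for _ in range(n_samples):
        draws = [rng.betavariate(a, b) for a, b in posterior]
        wins[draws.index(max(draws))] += 1
    return [w / n_samples for w in wins]


def select_model(preds, oracle, budget, seed=0):
    """Label up to `budget` points, then return (best model index, P(best))."""
    rng = random.Random(seed)
    posterior = consensus_prior(preds)
    unlabeled = set(range(len(preds[0])))
    for _ in range(min(budget, len(unlabeled))):
        # Query the unlabeled point on which the candidate models disagree most.
        i = max(sorted(unlabeled),
                key=lambda j: vote_entropy([m[j] for m in preds]))
        unlabeled.remove(i)
        y = oracle(i)  # ask the annotator for the true label of point i
        # Bayesian update: one more success or failure for each model.
        for h, m in enumerate(preds):
            if m[i] == y:
                posterior[h][0] += 1.0
            else:
                posterior[h][1] += 1.0
    p = prob_best(posterior, rng)
    return p.index(max(p)), p
```

Here `preds` is a list of per-model prediction lists over the same unlabeled pool, and `oracle(i)` returns the ground-truth label for point `i`. Points on which all models agree have zero vote entropy and are queried last, since labeling them cannot separate the candidates; the consensus prior only sets the starting belief and is overridden as labels arrive.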
Taxonomy
Topics: Advanced Control Systems Optimization · Advanced Data Processing Techniques · Fault Detection and Control Systems
