Detecting and Correcting for Label Shift with Black Box Predictors
Zachary C. Lipton, Yu-Xiang Wang, Alex Smola

TL;DR
This paper introduces Black Box Shift Estimation (BBSE), a method that uses black box predictors to detect and correct label shift. It works even when the predictor is biased, inaccurate, or uncalibrated, provided its confusion matrix is invertible, and it comes with a consistency proof and error bounds.
Contribution
The paper proposes BBSE, a novel approach for estimating and correcting label shift using black box predictors, with theoretical guarantees and applicability to high-dimensional data.
Findings
BBSE accurately estimates label distribution shifts.
BBSE improves classifier performance under label shift.
The method is effective on high-dimensional image datasets.
Abstract
Faced with distribution shift between training and test set, we wish to detect and quantify the shift, and to correct our classifiers without test set labels. Motivated by medical diagnosis, where diseases (targets) cause symptoms (observations), we focus on label shift, where the label marginal p(y) changes but the conditional p(x|y) does not. We propose Black Box Shift Estimation (BBSE) to estimate the test distribution p(y). BBSE exploits arbitrary black box predictors to reduce dimensionality prior to shift correction. While better predictors give tighter estimates, BBSE works even when predictors are biased, inaccurate, or uncalibrated, so long as their confusion matrices are invertible. We prove BBSE's consistency, bound its error, and introduce a statistical test that uses BBSE to detect shift. We also leverage BBSE to correct classifiers. Experiments demonstrate accurate…
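The role of the invertible confusion matrix can be illustrated with a short sketch. It assumes hard (argmax) predictions from the black box, a labeled held-out sample from the training distribution, and unlabeled test inputs. The function names `bbse_weights` and `bbse_target_prior` are hypothetical, chosen for this illustration; this is not the authors' released code. The idea is that the joint confusion matrix C[i, j] = P(prediction = i, label = j) on training data and the distribution mu of predictions on test data satisfy C w = mu under label shift, where w[j] is the ratio of test to training label probability.

```python
import numpy as np


def bbse_weights(y_true_src, y_pred_src, y_pred_tgt, n_classes):
    """Estimate label-shift weights w[j] ~ q(y=j) / p(y=j).

    y_true_src, y_pred_src: labels and black box predictions on held-out
    source (training-distribution) data. y_pred_tgt: black box predictions
    on unlabeled target (test) data. All are integer class indices.
    """
    y_true_src = np.asarray(y_true_src)
    y_pred_src = np.asarray(y_pred_src)
    y_pred_tgt = np.asarray(y_pred_tgt)

    # Joint confusion matrix on source data: C[i, j] = P(pred=i, true=j).
    C = np.zeros((n_classes, n_classes))
    np.add.at(C, (y_pred_src, y_true_src), 1.0)
    C /= len(y_true_src)

    # Distribution of predicted labels on the target data.
    mu = np.bincount(y_pred_tgt, minlength=n_classes) / len(y_pred_tgt)

    # Solve C w = mu; raises LinAlgError if C is singular.
    return np.linalg.solve(C, mu)


def bbse_target_prior(w, y_true_src, n_classes):
    """Turn weights into an estimate of the target label distribution."""
    p_src = np.bincount(np.asarray(y_true_src), minlength=n_classes)
    p_src = p_src / p_src.sum()
    # Assumption for this sketch: clip negative estimates, then renormalize.
    q = np.clip(w * p_src, 0.0, None)
    return q / q.sum()
```

The predictor does not need to be accurate for this to work: any systematic errors appear in both C and mu, so they cancel when the linear system is solved. With few samples C can be ill-conditioned, so the estimated weights may be noisy or negative, which is why the sketch clips them.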
Videos
Zack Chase Lipton: The Medical Machine Learning Landscape (YouTube)
Taxonomy
Topics: Image Retrieval and Classification Techniques · Machine Learning and Data Classification · Face and Expression Recognition
