Detecting Errors and Estimating Accuracy on Unlabeled Data with Self-training Ensembles
Jiefeng Chen, Frederick Liu, Besim Avci, Xi Wu, Yingyu Liang, Somesh Jha

TL;DR
This paper introduces a framework combining ensemble learning and self-training to accurately estimate model performance and detect errors on unlabeled data, with theoretical guarantees and state-of-the-art results.
Contribution
The paper presents a novel, theoretically grounded framework that jointly addresses unsupervised accuracy estimation and error detection using self-training ensembles.
Findings
Reduced accuracy estimation error on iWildCam by at least 70% relative to existing methods
Improved the F1 score for error detection by at least 4.7% relative to existing methods
Achieved state-of-the-art results on 59 tasks
Abstract
When a deep learning model is deployed in the wild, it can encounter test data drawn from distributions different from the training data distribution and suffer a drop in performance. For safe deployment, it is essential to estimate the accuracy of the pre-trained model on the test data. However, the labels for the test inputs are usually not immediately available in practice, and obtaining them can be expensive. This observation leads to two challenging tasks: (1) unsupervised accuracy estimation, which aims to estimate the accuracy of a pre-trained classifier on a set of unlabeled test inputs; (2) error detection, which aims to identify mis-classified test inputs. In this paper, we propose a principled and practically effective framework that simultaneously addresses the two tasks. The proposed framework iteratively learns an ensemble of models to identify mis-classified data points and…
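The link between the two tasks can be illustrated with a small sketch. It is not the authors' implementation: it assumes an ensemble has already been trained, uses a plain majority vote as a stand-in for the ensemble's prediction, and omits the iterative self-training step entirely. The function names (majority_vote, estimate_accuracy_and_errors) and the toy predictions are hypothetical. Test points where the ensemble disagrees with the pre-trained model are flagged as likely errors, and the agreement rate serves as the accuracy estimate.

```python
import numpy as np


def majority_vote(member_preds, num_classes):
    """Return the majority-vote label per point.

    member_preds: integer array of shape (n_members, n_points).
    Ties resolve to the lowest class index (a simplifying assumption).
    """
    member_preds = np.asarray(member_preds)
    # votes[c, i] = number of ensemble members predicting class c for point i
    votes = np.stack(
        [(member_preds == c).sum(axis=0) for c in range(num_classes)]
    )
    return votes.argmax(axis=0)


def estimate_accuracy_and_errors(model_preds, member_preds, num_classes):
    """Estimate the pre-trained model's accuracy without labels.

    A point is flagged as a suspected error when the ensemble's
    majority vote disagrees with the pre-trained model's prediction.
    The accuracy estimate is the fraction of unflagged points.
    """
    model_preds = np.asarray(model_preds)
    ensemble_preds = majority_vote(member_preds, num_classes)
    flagged = ensemble_preds != model_preds
    estimated_accuracy = 1.0 - flagged.mean()
    return estimated_accuracy, flagged


# Toy data (hypothetical): 5 unlabeled test points, 3 classes.
model_preds = np.array([0, 1, 2, 1, 0])      # pre-trained model
member_preds = np.array([                    # 3 ensemble members
    [0, 1, 2, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 2, 2, 0, 1],
])

est_acc, flagged = estimate_accuracy_and_errors(
    model_preds, member_preds, num_classes=3
)
print(est_acc)            # estimated accuracy on the unlabeled set
print(flagged.tolist())   # True marks a suspected mis-classification
```

The quality of both outputs depends on how well the ensemble's disagreements line up with the model's true errors, which is what the iterative ensemble learning and self-training described in the abstract are meant to improve.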
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Machine Learning and Data Classification
