Detecting Errors and Estimating Accuracy on Unlabeled Data with Self-training Ensembles

Jiefeng Chen, Frederick Liu, Besim Avci, Xi Wu, Yingyu Liang, Somesh Jha

arXiv:2106.15728 · cs.LG · May 16, 2023 · 6 cites

TL;DR

This paper introduces a framework combining ensemble learning and self-training to accurately estimate model performance and detect errors on unlabeled data, with theoretical guarantees and state-of-the-art results.

Contribution

The paper presents a novel, theoretically grounded framework that jointly addresses unsupervised accuracy estimation and error detection using self-training ensembles.

Findings

01 Reduced accuracy estimation error on iWildCam by at least 70%

02 Improved F1 score for error detection by at least 4.7%

03 Demonstrated state-of-the-art results on 59 diverse tasks

Abstract

When a deep learning model is deployed in the wild, it can encounter test data drawn from distributions different from the training data distribution and suffer a drop in performance. For safe deployment, it is essential to estimate the accuracy of the pre-trained model on the test data. However, the labels for the test inputs are usually not immediately available in practice, and obtaining them can be expensive. This observation leads to two challenging tasks: (1) unsupervised accuracy estimation, which aims to estimate the accuracy of a pre-trained classifier on a set of unlabeled test inputs; (2) error detection, which aims to identify misclassified test inputs. In this paper, we propose a principled and practically effective framework that simultaneously addresses the two tasks. The proposed framework iteratively learns an ensemble of models to identify misclassified data points and…
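The two tasks named in the abstract can be illustrated with a minimal sketch. It assumes that a test input is flagged as a likely error when the pre-trained classifier disagrees with the majority vote of an ensemble, and that estimated accuracy is the fraction of inputs not flagged. The function names (`majority_vote`, `estimate_accuracy_and_errors`) and the toy predictions are hypothetical and are not taken from the paper or its repository; the iterative self-training that the abstract mentions is not shown.

```python
import numpy as np


def majority_vote(ensemble_preds, num_classes):
    """Return the per-sample majority label of an ensemble.

    ensemble_preds: int array of shape (n_models, n_samples).
    Ties are broken in favor of the smallest class index.
    """
    counts = np.stack(
        [(ensemble_preds == c).sum(axis=0) for c in range(num_classes)],
        axis=0,
    )
    return counts.argmax(axis=0)


def estimate_accuracy_and_errors(model_preds, ensemble_preds, num_classes):
    """Flag inputs where the model disagrees with the ensemble vote.

    Returns (estimated_accuracy, flagged), where flagged is a boolean
    array marking suspected misclassifications. This is an illustrative
    proxy, assumed here for the sketch, not the paper's full procedure.
    """
    vote = majority_vote(ensemble_preds, num_classes)
    flagged = model_preds != vote
    estimated_accuracy = 1.0 - flagged.mean()
    return estimated_accuracy, flagged


# Toy example: 5 unlabeled test inputs, 3 classes, 3 ensemble members.
model_preds = np.array([0, 1, 2, 1, 0])
ensemble_preds = np.array([
    [0, 1, 2, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 2, 2, 0, 1],
])
est_acc, flagged = estimate_accuracy_and_errors(
    model_preds, ensemble_preds, num_classes=3
)
```

In this toy case only the fourth input is flagged, so the estimate is 4 of 5 inputs correct. The quality of such an estimate depends entirely on how well the ensemble's disagreements track the model's true errors.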

Code & Models

Repositories

jfc43/self-training-ensembles (PyTorch, official)

Videos

Detecting Errors and Estimating Accuracy on Unlabeled Data with Self-training Ensembles · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Machine Learning and Data Classification