Semisupervised Classifier Evaluation and Recalibration

Peter Welinder; Max Welling; Pietro Perona

arXiv:1210.2162 · cs.LG · October 9, 2012 · 1 citation

TL;DR

This paper introduces Semisupervised Performance Evaluation (SPE), a semisupervised method that estimates a classifier's performance on a new dataset, and recalibrates the classifier, from a small number of labels, using a generative model of the classifier's confidence scores.

Contribution

The paper presents a semisupervised approach to performance estimation and recalibration, based on a generative model of confidence scores, that reduces the amount of labeled data required.

Findings

01. Accurately estimates performance curves with few labels

02. Provides confidence bounds for performance estimates

03. Enables classifier recalibration using limited labeled data

Abstract

How many labeled examples are needed to estimate a classifier's performance on a new dataset? We study the case where data is plentiful, but labels are expensive. We show that by making a few reasonable assumptions on the structure of the data, it is possible to estimate performance curves, with confidence bounds, using a small number of ground truth labels. Our approach, which we call Semisupervised Performance Evaluation (SPE), is based on a generative model for the classifier's confidence scores. In addition to estimating the performance of classifiers on new datasets, SPE can be used to recalibrate a classifier by re-estimating the class-conditional confidence distributions.

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Algorithms and Data Compression