Confusion Matrices and Accuracy Statistics for Binary Classifiers Using Unlabeled Data: The Diagnostic Test Approach

Richard Evans

arXiv:2208.12664·stat.ML·December 29, 2022

Open Access

TL;DR

This paper adapts diagnostic test methods to estimate confusion matrices and accuracy for binary classifiers on unlabeled data, enabling evaluation without labeled datasets.

Contribution

It shows how to modify existing medical diagnostic test methods, which estimate sensitivity and specificity without a gold standard, so that they estimate confusion matrices and accuracy statistics for binary classifiers on unlabeled data.

Findings

01 The method estimates confusion matrices without labeled data

02 It applies to both supervised and unsupervised binary classifiers

03 It provides a tool for evaluating classifiers in unlabeled settings

Abstract

Medical researchers have solved the problem of estimating the sensitivity and specificity of binary medical diagnostic tests without gold standard tests for comparison. That problem is the same as estimating confusion matrices for classifiers on unlabeled data. This article describes how to modify the diagnostic test solutions to estimate confusion matrices and accuracy statistics for supervised or unsupervised binary classifiers on unlabeled data.
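The abstract does not say which diagnostic test method the article adapts, so the following is only an illustrative sketch of one well-known approach from that literature: a two-class latent class model fitted by EM. It assumes several binary classifiers have labeled the same unlabeled items and that their errors are conditionally independent given the true class; with a single population, at least three classifiers are needed for the model to be identifiable. The function names (`latent_class_em`, `expected_confusion_matrix`, `simulate_votes`) are hypothetical and do not come from the article.

```python
import numpy as np

def latent_class_em(votes, n_iter=500, tol=1e-10):
    """Fit a two-class latent class model by EM (illustrative sketch).

    votes: (n_items, n_classifiers) array of 0/1 predictions, no true labels.
    Assumption: classifier errors are conditionally independent given the class.
    Returns (prevalence, sensitivity per classifier, specificity per classifier).
    """
    votes = np.asarray(votes, dtype=float)
    n_clf = votes.shape[1]
    prev = 0.5
    # Start above 0.5 so the latent class 1 corresponds to "positive".
    se = np.full(n_clf, 0.8)
    sp = np.full(n_clf, 0.8)
    for _ in range(n_iter):
        # E-step: posterior probability that each item is truly positive.
        like_pos = prev * np.prod(se**votes * (1 - se)**(1 - votes), axis=1)
        like_neg = (1 - prev) * np.prod((1 - sp)**votes * sp**(1 - votes), axis=1)
        w = like_pos / (like_pos + like_neg)
        # M-step: weighted proportions give the updated parameters.
        new_prev = w.mean()
        new_se = (w[:, None] * votes).sum(axis=0) / w.sum()
        new_sp = ((1 - w)[:, None] * (1 - votes)).sum(axis=0) / (1 - w).sum()
        change = max(abs(new_prev - prev),
                     np.abs(new_se - se).max(),
                     np.abs(new_sp - sp).max())
        prev, se, sp = new_prev, new_se, new_sp
        if change < tol:
            break
    return prev, se, sp

def expected_confusion_matrix(prev, se, sp, n_items):
    """Expected counts [[TP, FN], [FP, TN]] for one classifier."""
    tp = n_items * prev * se
    fn = n_items * prev * (1 - se)
    fp = n_items * (1 - prev) * (1 - sp)
    tn = n_items * (1 - prev) * sp
    return np.array([[tp, fn], [fp, tn]])

def simulate_votes(n_items, prev, se, sp, seed=0):
    """Hypothetical helper: simulate conditionally independent classifiers."""
    rng = np.random.default_rng(seed)
    truth = rng.random(n_items) < prev
    # Probability of a positive vote: se for true positives, 1 - sp otherwise.
    p_pos = np.where(truth[:, None], np.asarray(se), 1 - np.asarray(sp))
    return (rng.random((n_items, len(se))) < p_pos).astype(int)

if __name__ == "__main__":
    votes = simulate_votes(20000, 0.3, [0.9, 0.85, 0.8], [0.95, 0.9, 0.85])
    prev, se, sp = latent_class_em(votes)
    print("prevalence:", prev)
    print("confusion matrix, classifier 0:")
    print(expected_confusion_matrix(prev, se[0], sp[0], len(votes)))
```

Once prevalence, sensitivity, and specificity are estimated, overall accuracy for a classifier follows as prevalence × sensitivity + (1 − prevalence) × specificity. If the conditional independence assumption fails, for example when classifiers share training data or features, these estimates can be biased.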

Taxonomy

Topics: Advanced Statistical Methods and Models · Machine Learning and Data Classification

Methods: Test