Using theoretical ROC curves for analysing machine learning binary classifiers

Luma Omar; Ioannis Ivrissimtzis

arXiv:1909.09816 · cs.LG · September 24, 2019

TL;DR

This paper explores the use of theoretical ROC curves derived from fitted probability distributions to analyze binary classifiers, providing insights beyond empirical performance measures.

Contribution

It introduces a method to fit theoretical distributions to classifier responses and analyze ROC curves for better understanding of classifier behavior.

Findings

1. Beta distributions effectively model classifier responses.
2. Theoretical ROC analysis reveals extremal behaviors at ROC curve ends.
3. Fitting distributions aids in classifier performance interpretation.
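
One way to see why the ends of a theoretical ROC curve can behave in extreme ways is to look at its slope under a beta model. The calculation below is a standard one and is offered as a sketch, not necessarily the derivation used in the paper. It assumes (as labeled assumptions) that the positive-class response has density $f_{s}$ of $\mathrm{Beta}(a_{s}, b_{s})$ with CDF $F_{s}$, that the negative-class response has density $f_{n}$ of $\mathrm{Beta}(a_{n}, b_{n})$ with CDF $F_{n}$, and that an input is classified as positive when its response exceeds the threshold $t$.

$$
\mathrm{FPR}(t) = 1 - F_{n}(t), \qquad \mathrm{TPR}(t) = 1 - F_{s}(t), \qquad t \in [0, 1]
$$

$$
\frac{d\,\mathrm{TPR}}{d\,\mathrm{FPR}} = \frac{f_{s}(t)}{f_{n}(t)} = \frac{B(a_{n}, b_{n})}{B(a_{s}, b_{s})}\, t^{\,a_{s} - a_{n}}\, (1 - t)^{\,b_{s} - b_{n}}
$$

Under these assumptions, as $t \to 1$ (the end of the curve at the origin) the slope tends to $0$ if $b_{s} > b_{n}$ and to $\infty$ if $b_{s} < b_{n}$. As $t \to 0$ (the end at $(1, 1)$) it tends to $0$ if $a_{s} > a_{n}$ and to $\infty$ if $a_{s} < a_{n}$. The behavior at each end is therefore governed by the difference of a single pair of fitted shape parameters.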

Abstract

Most binary classifiers work by processing the input to produce a scalar response and comparing it to a threshold value. The various measures of classifier performance assume, explicitly or implicitly, probability distributions $P_{s}$ and $P_{n}$ of the response belonging to either class, probability distributions for the cost of each type of misclassification, and compute a performance score from the expected cost. In machine learning, classifier responses are obtained experimentally and performance scores are computed directly from them, without any assumptions on $P_{s}$ and $P_{n}$. Here, we argue that the omitted step of estimating theoretical distributions for $P_{s}$ and $P_{n}$ can be useful. In a biometric security example, we fit beta distributions to the responses of two classifiers, one based on logistic regression and one on ANNs, and use them to establish a categorisation into a…
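
The step the abstract describes, fitting beta distributions to the responses of each class and reading a theoretical ROC curve off the fitted models, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names (`fit_beta`, `theoretical_roc`, `roc_auc`) are hypothetical, the responses are assumed to lie in [0, 1], a response above the threshold is assumed to mean "positive", and the data are synthetic stand-ins rather than the paper's biometric responses.

```python
import numpy as np
from scipy import stats


def fit_beta(responses, eps=1e-6):
    """Fit Beta(a, b) on [0, 1] to classifier responses by maximum likelihood.

    Responses are clipped away from exactly 0 and 1, because the beta
    likelihood is undefined at the endpoints (an assumption of this sketch).
    """
    r = np.clip(np.asarray(responses, dtype=float), eps, 1.0 - eps)
    a, b, _, _ = stats.beta.fit(r, floc=0, fscale=1)  # fix support to [0, 1]
    return a, b


def theoretical_roc(params_s, params_n, num=201):
    """Return (FPR, TPR) of the fitted models over a grid of thresholds.

    FPR(t) = P_n(response > t) and TPR(t) = P_s(response > t), both
    computed with the survival function of the fitted beta distributions.
    """
    t = np.linspace(0.0, 1.0, num)
    fpr = stats.beta.sf(t, *params_n)
    tpr = stats.beta.sf(t, *params_s)
    return fpr, tpr


def roc_auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    x, y = fpr[::-1], tpr[::-1]  # FPR decreases with t, so reverse it
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))


# Synthetic stand-in data (NOT the paper's data): positives skew high,
# negatives skew low.
rng = np.random.default_rng(0)
responses_s = rng.beta(5.0, 2.0, size=2000)
responses_n = rng.beta(2.0, 5.0, size=2000)

params_s = fit_beta(responses_s)
params_n = fit_beta(responses_n)
fpr, tpr = theoretical_roc(params_s, params_n)
print("fitted P_s shape parameters:", params_s)
print("fitted P_n shape parameters:", params_n)
print("theoretical AUC:", roc_auc(fpr, tpr))
```

Because the curve comes from smooth fitted distributions rather than a finite set of thresholds, it can be evaluated arbitrarily close to either end, which is where empirical ROC curves have the fewest samples to rely on.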

Taxonomy

Methods: Logistic Regression