Stacked Confusion Reject Plots (SCORE)

Stephan Hasler; Lydia Fischer

arXiv:2406.17346·cs.LG·June 26, 2024

TL;DR

SCORE is a visualization for classifier uncertainty and rejection trade-offs, intended to be easier to interpret than traditional reject curves, especially for non-experts.

Contribution

The paper proposes Stacked Confusion Reject Plots (SCORE), a visualization method meant to make a classifier's rejection behavior easier to interpret in critical applications such as health and driver assistance.

Findings

01 SCORE is proposed to give a more intuitive view of the data and the classifier's behavior than common reject curves.

02 The method is demonstrated with example plots on artificial Gaussian data.

03 The code is available as a Python package.

Abstract

Machine learning is increasingly applied in critical application areas such as health and driver assistance. To minimize the risk of wrong decisions in such applications, it is necessary to consider the certainty of a classification and reject uncertain samples. An established tool for this is the reject curve, which visualizes the trade-off between the number of rejected samples and classification performance metrics. We argue that common reject curves are too abstract and hard for non-experts to interpret. We propose Stacked Confusion Reject Plots (SCORE), which offer a more intuitive understanding of the data used and of the classifier's behavior. We present example plots on artificial Gaussian data to document the different options of SCORE and provide the code as a Python package.
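
The trade-off described in the abstract can be illustrated with a small, standard-library-only sketch. It is not the API of the hri-eu/score package and not the authors' implementation. Every name here (`make_gaussian_data`, `predict_with_certainty`, `reject_counts`, `accuracy_on_accepted`) is hypothetical, and the setup is an assumption chosen for illustration: two 1-D unit-variance Gaussians, a nearest-mean classifier, and the distance to the decision boundary used as the certainty measure. For each reject threshold it tabulates rejected samples per true class and accepted samples per confusion-matrix cell, which is roughly the kind of count a stacked confusion plot could display.

```python
import random


def make_gaussian_data(n_per_class=500, seed=0):
    """Draw 1-D samples from two unit-variance Gaussians centred at -1 and +1."""
    rng = random.Random(seed)
    data = []
    for label, mean in ((0, -1.0), (1, 1.0)):
        data.extend((rng.gauss(mean, 1.0), label) for _ in range(n_per_class))
    return data


def predict_with_certainty(x):
    """Nearest-mean classifier; certainty is the distance to the boundary at 0 (an assumption)."""
    return (1 if x >= 0 else 0), abs(x)


def reject_counts(data, threshold):
    """Count rejected samples per true class and accepted samples per (true, predicted) cell."""
    counts = {
        "rejected": {0: 0, 1: 0},
        "accepted": {(t, p): 0 for t in (0, 1) for p in (0, 1)},
    }
    for x, true_label in data:
        predicted, certainty = predict_with_certainty(x)
        if certainty < threshold:
            counts["rejected"][true_label] += 1
        else:
            counts["accepted"][(true_label, predicted)] += 1
    return counts


def accuracy_on_accepted(counts):
    """Accuracy computed only over the samples that were not rejected."""
    accepted = sum(counts["accepted"].values())
    correct = counts["accepted"][(0, 0)] + counts["accepted"][(1, 1)]
    return correct / accepted if accepted else float("nan")


data = make_gaussian_data()
thresholds = [0.0, 0.5, 1.0, 1.5]
rows = [reject_counts(data, t) for t in thresholds]
for t, c in zip(thresholds, rows):
    rejected = sum(c["rejected"].values())
    print(f"threshold={t:.1f} rejected={rejected:4d} "
          f"accuracy={accuracy_on_accepted(c):.3f} cells={c['accepted']}")
```

A classic reject curve would plot only the accuracy column against the number of rejected samples. Keeping the per-cell counts for every threshold retains how many samples of each class are rejected, correctly classified, or misclassified, which is the information a stacked view builds on.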

Code & Models

Repositories

hri-eu/score (official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Bayesian Modeling and Causal Inference · Gaussian Processes and Bayesian Inference