When to Accept Automated Predictions and When to Defer to Human Judgment?

Daniel Sikar; Artur Garcez; Tillman Weyde; Robin Bloomfield; Kaleem Peeroo

arXiv:2407.07821·cs.LG·August 14, 2024

Open Access

TL;DR

This paper introduces a clustering-based metric to assess the reliability of neural network predictions under distribution shifts, helping decide when to trust automated decisions or defer to humans.

Contribution

It proposes a novel distance-based confidence measure using clustering of neural network outputs to evaluate prediction safety during distribution shifts.

Findings

1. The proposed metric correlates with prediction correctness under distribution shifts.

2. It is effective across different datasets and neural network architectures.

3. The approach can guide when to accept automated predictions or defer to humans.

Abstract

Ensuring the reliability and safety of automated decision-making is crucial. It is well known that data distribution shifts in machine learning can produce unreliable outcomes. This paper proposes a new approach for measuring the reliability of predictions under distribution shifts. We analyze how the outputs of a trained neural network change, using clustering to measure distances between outputs and class centroids. We propose this distance as a metric to evaluate the confidence of predictions under distribution shifts. We assign each prediction to a cluster whose centroid is the mean softmax output over all correct predictions of a given class. We then define a safety threshold for a class as the smallest distance from an incorrect prediction to the given class centroid. We evaluate the approach on the MNIST and CIFAR-10 datasets using a Convolutional Neural Network and a…
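
The centroid and threshold definitions in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code. It assumes Euclidean distance (the abstract does not name the distance), reads "an incorrect prediction" as a sample wrongly assigned to the class in question, and accepts everything for a class with no observed errors. The function names `class_centroids`, `safety_thresholds`, and `accept_or_defer` are hypothetical.

```python
import numpy as np


def class_centroids(softmax, labels, num_classes):
    """Mean softmax vector over the correctly classified samples of each class."""
    preds = softmax.argmax(axis=1)
    # NaN marks a class with no correct predictions (no centroid available).
    centroids = np.full((num_classes, softmax.shape[1]), np.nan)
    for c in range(num_classes):
        correct = (preds == c) & (labels == c)
        if correct.any():
            centroids[c] = softmax[correct].mean(axis=0)
    return centroids


def safety_thresholds(softmax, labels, centroids):
    """Per class: smallest distance from a wrongly predicted sample to the centroid.

    Assumption: Euclidean distance, measured to the centroid of the
    predicted class.
    """
    preds = softmax.argmax(axis=1)
    dists = np.linalg.norm(softmax - centroids[preds], axis=1)
    # Assumption: a class with no observed errors gets an infinite threshold.
    thresholds = np.full(len(centroids), np.inf)
    for c in range(len(centroids)):
        wrong = (preds == c) & (labels != c)
        if wrong.any():
            thresholds[c] = dists[wrong].min()
    return thresholds


def accept_or_defer(softmax, centroids, thresholds):
    """Accept a prediction only if it lies closer to its centroid than the threshold."""
    preds = softmax.argmax(axis=1)
    dists = np.linalg.norm(softmax - centroids[preds], axis=1)
    accept = dists < thresholds[preds]  # False means: defer to a human
    return preds, accept
```

Under these assumptions, the centroids and thresholds would be fitted once on labelled data, and `accept_or_defer` would then be applied to the softmax outputs of new, possibly shifted inputs. A prediction whose class has no centroid (NaN distance) is deferred, since the comparison evaluates to false.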

Taxonomy

Topics: Forecasting Techniques and Applications

Methods: Attention Is All You Need · Residual Connection · Byte Pair Encoding · Layer Normalization · Linear Layer · Label Smoothing · Adam · Dropout · Dense Connections · Absolute Position Encodings