Interpretable Failure Detection with Human-Level Concepts

Kien X. Nguyen; Tang Li; Xi Peng

arXiv:2502.05275 · cs.CV · April 16, 2025

TL;DR

This paper proposes a failure detection method that uses human-level concepts to make neural network predictions more reliable and interpretable, reducing the false positive rate by 3.7% on ImageNet and 9% on EuroSAT.

Contribution

The paper introduces a concept-based ranking approach to failure detection that improves transparency and reduces false positives relative to confidence scores computed from category-level logits.

Findings

01. Reduces the false positive rate by 3.7% on ImageNet

02. Reduces the false positive rate by 9% on EuroSAT

03. Provides interpretable failure explanations

Abstract

Reliable failure detection is critical in safety-critical applications. Yet neural networks are known to produce overconfident predictions for misclassified samples. Failure detection therefore remains difficult, because existing confidence score functions rely on category-level signals (the logits) to detect failures. This research introduces a strategy that leverages human-level concepts for a dual purpose: to reliably detect when a model fails and to transparently interpret why. By integrating a nuanced array of signals for each category, our method enables a finer-grained assessment of the model's confidence. We present a simple yet highly effective approach based on the ordinal ranking of concept activations for the input image. Without bells and whistles, our method significantly reduces the false positive rate across diverse real-world image classification…
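The abstract contrasts category-level confidence (the logits) with a score built from the ordinal ranking of concept activations, but it does not give the scoring rule. The sketch below is therefore only an illustration under stated assumptions, not the authors' method: it assumes each concept is tied to exactly one category, that a per-concept activation for the image is already available, and that confidence is the share of the top-ranked concepts belonging to the predicted category. The names `max_softmax_confidence`, `concept_rank_confidence`, `concept_to_class`, and `top_k` are hypothetical.

```python
import numpy as np


def max_softmax_confidence(logits: np.ndarray) -> float:
    """Category-level baseline: maximum softmax probability over class logits."""
    shifted = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(shifted) / np.exp(shifted).sum()
    return float(probs.max())


def concept_rank_confidence(
    activations: np.ndarray,
    concept_to_class: np.ndarray,
    predicted_class: int,
    top_k: int = 3,
) -> float:
    """Hypothetical concept-level score (an assumption, not the paper's rule):
    the fraction of the top_k most activated concepts that belong to the
    predicted class."""
    top = np.argsort(-activations)[:top_k]  # indices of the highest activations
    return float(np.mean(concept_to_class[top] == predicted_class))


# Toy input: two classes with three concepts each (illustrative values only).
activations = np.array([0.9, 0.2, 0.1, 0.8, 0.7, 0.3])
concept_to_class = np.array([0, 0, 0, 1, 1, 1])
logits = np.array([2.0, 1.9])

predicted = int(logits.argmax())
baseline = max_softmax_confidence(logits)
concept_score = concept_rank_confidence(activations, concept_to_class, predicted)
```

In this toy input, only one of the three highest-ranked concepts belongs to the predicted class, so the concept-level score is 1/3. Under an assumed threshold of 0.5 the prediction would be flagged as a likely failure, and the two disagreeing concepts show which category the evidence favors, which is the kind of explanation a single logit-based number cannot provide.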

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Fault Detection and Control Systems · Risk and Safety Analysis