Finding Patterns in Ambiguity: Interpretable Stress Testing in the Decision Boundary

Inês Gomes; Luís F. Teixeira; Jan N. van Rijn; Carlos Soares; André Restivo; Luís Cunha; Moisés Santos

arXiv:2408.06302·cs.LG·August 13, 2024

TL;DR

This paper introduces a method to interpret deep binary classifiers by selecting representative boundary prototypes and applying explanation algorithms, improving understanding of low-confidence decisions.

Contribution

It proposes a novel approach combining prototype selection and post-model explanations to enhance interpretability of decision boundaries in deep classifiers.
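The prototype-selection idea can be illustrated with a minimal sketch. It assumes that boundary samples are those whose predicted probability lies within a margin of 0.5 and that prototypes are picked by running a plain k-means and keeping the real sample nearest to each centroid. The summary does not say which selection algorithm the authors use, so the function name `select_boundary_prototypes`, the `margin` threshold, and the clustering choice are all illustrative assumptions, not the paper's method.

```python
import numpy as np


def select_boundary_prototypes(embeddings, probs, k=3, margin=0.1, n_iter=50, seed=0):
    """Pick k prototype indices among low-confidence samples (illustrative only).

    embeddings: (n, d) feature vectors, e.g. a classifier's latent features (assumed input).
    probs:      (n,) positive-class probabilities from a binary classifier.
    margin:     samples with |p - 0.5| <= margin count as boundary samples (assumed rule).
    """
    embeddings = np.asarray(embeddings, dtype=float)
    probs = np.asarray(probs, dtype=float)

    # Keep only the ambiguous samples near the decision boundary.
    boundary_idx = np.flatnonzero(np.abs(probs - 0.5) <= margin)
    if len(boundary_idx) < k:
        raise ValueError("fewer boundary samples than requested prototypes")
    points = embeddings[boundary_idx]

    # Plain Lloyd's k-means, initialised from k distinct boundary samples.
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    # A prototype is the real sample closest to each centroid.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return boundary_idx[dists.argmin(axis=0)]


# Usage on synthetic data (random embeddings, evenly spaced probabilities).
demo_embeddings = np.random.default_rng(1).normal(size=(200, 4))
demo_probs = np.linspace(0.0, 1.0, 200)
prototype_idx = select_boundary_prototypes(demo_embeddings, demo_probs, k=3)
```

Returning indices of real samples, rather than the centroids themselves, keeps every prototype an actual instance that can then be passed to an explanation algorithm.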

Findings

01. Revealed distinct, compact clusters of prototypes.

02. Captured essential features leading to low-confidence decisions.

03. Demonstrated effectiveness through visualizations and GradientSHAP analysis.
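To give a feel for what a GradientSHAP attribution computes, here is a small from-scratch sketch on a toy logistic model. It averages gradient times (input minus baseline) over random baselines and random interpolation points, which is the expected-gradients estimate behind GradientSHAP. The model weights `w`, bias `b`, the baselines, and the function name are hypothetical, the input noise that library implementations usually add is omitted, and none of this reflects the paper's actual classifier or tooling.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def gradient_shap_logistic(x, baselines, w, b, n_samples=200, seed=0):
    """Expected-gradients attribution for f(x) = sigmoid(w.x + b) (toy model).

    x:         (d,) input to explain.
    baselines: (m, d) reference inputs, sampled uniformly at random.
    Returns a (d,) attribution vector.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    baselines = np.asarray(baselines, dtype=float)
    w = np.asarray(w, dtype=float)

    # One random baseline and one random interpolation coefficient per sample.
    chosen = baselines[rng.integers(len(baselines), size=n_samples)]
    alpha = rng.uniform(size=(n_samples, 1))
    points = chosen + alpha * (x - chosen)

    # Analytic gradient of the logistic output: f * (1 - f) * w.
    p = sigmoid(points @ w + b)
    grads = (p * (1.0 - p))[:, None] * w

    # Average gradient times (input - baseline) over all samples.
    return np.mean(grads * (x - chosen), axis=0)


# Usage with hypothetical weights and a single all-zero baseline.
w_demo = np.array([2.0, -1.0, 0.0])
x_demo = np.array([1.0, 1.0, 1.0])
base_demo = np.zeros((1, 3))
attr = gradient_shap_logistic(x_demo, base_demo, w_demo, b=0.0, n_samples=5000)
```

With enough samples, the attributions sum approximately to the model output at the input minus the mean output over the baselines, which is a convenient sanity check for an estimator of this kind.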

Abstract

The increasing use of deep learning across various domains highlights the importance of understanding the decision-making processes of these black-box models. Recent research on the decision boundaries of deep classifiers relies on synthetic instances generated in areas of low confidence, uncovering samples that challenge both models and humans. We propose a novel approach to enhance the interpretability of deep binary classifiers by selecting representative samples from the decision boundary (prototypes) and applying post-model explanation algorithms. We evaluate the effectiveness of our approach through 2D visualizations and GradientSHAP analysis. Our experiments demonstrate the potential of the proposed method, revealing distinct and compact clusters and diverse prototypes that capture essential features that lead to low-confidence decisions. By offering a more aggregated…

Code & Models

Repositories

inesgomes/db-patterns (PyTorch, official)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning