Two-step counterfactual generation for OOD examples

Nawid Keshtmand; Raul Santos-Rodriguez; Jonathan Lawry

arXiv:2302.05196 · cs.LG · February 13, 2023

Open Access

TL;DR

This paper introduces a novel approach for explaining why models classify data points as out-of-distribution by generating counterfactual examples that transition between OOD categories, enhancing interpretability in safety-critical AI systems.

Contribution

The paper presents a new method for generating OOD counterfactuals, addressing the gap in explainability for OOD detection in machine learning models.

Findings

01. Effective generation of OOD counterfactuals demonstrated on synthetic and benchmark datasets.

02. Comparison shows advantages over existing methods in interpretability metrics.

03. Provides insights into model decision boundaries for OOD detection.

Abstract

Two fundamental requirements for the deployment of machine learning models in safety-critical systems are to be able to detect out-of-distribution (OOD) data correctly and to be able to explain the prediction of the model. Although significant effort has gone into both OOD detection and explainable AI, there has been little work on explaining why a model predicts a certain data point is OOD. In this paper, we address this question by introducing the concept of an OOD counterfactual, which is a perturbed data point that iteratively moves between different OOD categories. We propose a method for generating such counterfactuals, investigate its application on synthetic and benchmark data, and compare it to several benchmark methods using a range of metrics.
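To make the idea of a data point moving between OOD categories concrete, here is a minimal sketch. It is not the authors' method. It assumes a Mahalanobis distance to an in-distribution Gaussian as the OOD score, two hypothetical thresholds (`near_thr`, `far_thr`), and three hypothetical category labels ("far-OOD", "near-OOD", "ID"). None of these names or choices come from the paper. The sketch descends the score gradient and records the point each time its category changes.

```python
import numpy as np

def mahalanobis_score(x, mean, cov_inv):
    """Assumed OOD score: Mahalanobis distance to the in-distribution Gaussian."""
    d = x - mean
    return float(np.sqrt(d @ cov_inv @ d))

def categorize(score, near_thr, far_thr):
    """Map a score to a hypothetical category (labels are illustrative only)."""
    if score > far_thr:
        return "far-OOD"
    if score > near_thr:
        return "near-OOD"
    return "ID"

def counterfactual_path(x, mean, cov_inv, near_thr, far_thr,
                        step=0.1, max_iter=1000):
    """Take fixed-size steps down the score gradient and record the
    perturbed point each time it enters a new category."""
    x = np.asarray(x, dtype=float).copy()
    start_cat = categorize(mahalanobis_score(x, mean, cov_inv), near_thr, far_thr)
    path = [(x.copy(), start_cat)]
    for _ in range(max_iter):
        score = mahalanobis_score(x, mean, cov_inv)
        if categorize(score, near_thr, far_thr) == "ID":
            break
        # Gradient of sqrt(d^T A d) w.r.t. x is A d / score (A symmetric).
        grad = cov_inv @ (x - mean) / score
        x = x - step * grad / np.linalg.norm(grad)
        cat = categorize(mahalanobis_score(x, mean, cov_inv), near_thr, far_thr)
        if cat != path[-1][1]:
            path.append((x.copy(), cat))
    return path

# Toy usage: a standard 2-D Gaussian as the in-distribution model.
mean = np.zeros(2)
cov_inv = np.eye(2)
path = counterfactual_path([5.0, 0.0], mean, cov_inv, near_thr=2.0, far_thr=4.0)
for point, category in path:
    print(category, np.round(point, 2))
```

Each recorded point is a counterfactual for the category before it: the nearest perturbation found along the descent path at which the assumed detector changes its verdict. A method like the one the abstract describes would also need to be compared against benchmark methods on a range of metrics, which this sketch does not attempt.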

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Anomaly Detection Techniques and Applications