Interval Abstractions for Robust Counterfactual Explanations

Junqi Jiang; Francesco Leofante; Antonio Rago; Francesca Toni

arXiv:2404.13736 · cs.LG · November 25, 2024

TL;DR

This paper introduces an interval abstraction method that provides provable robustness guarantees for counterfactual explanations in machine learning models, ensuring their validity under a wide range of model changes.

Contribution

The paper proposes a novel interval abstraction technique for parametric models, formalizes a new robustness notion called Δ-robustness, and develops algorithms to verify and generate robust counterfactual explanations.
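For a linear model the idea can be illustrated in a few lines. The sketch below is not the authors' implementation: it assumes, for illustration only, that the set of plausible model changes Δ allows every weight and the bias of a logistic regression model to shift by at most `delta`, and that the counterfactual should be classified as class 1 (logit above zero). The function names `logit_interval` and `is_delta_robust` are hypothetical.

```python
from typing import Sequence, Tuple


def logit_interval(
    w: Sequence[float], b: float, x: Sequence[float], delta: float
) -> Tuple[float, float]:
    """Bounds on w'.x + b' over all (w', b') with |w'_i - w_i| <= delta
    and |b' - b| <= delta (an assumed form of the model-change set)."""
    centre = sum(wi * xi for wi, xi in zip(w, x)) + b
    # Each weight can move the logit by at most delta * |x_i|,
    # and the bias by at most delta.
    radius = delta * (sum(abs(xi) for xi in x) + 1.0)
    return centre - radius, centre + radius


def is_delta_robust(
    w: Sequence[float], b: float, x_ce: Sequence[float], delta: float, target: int = 1
) -> bool:
    """True if x_ce keeps the target label for every model in the interval."""
    lo, hi = logit_interval(w, b, x_ce, delta)
    return lo > 0.0 if target == 1 else hi < 0.0


# Hypothetical model and two candidate counterfactuals.
w, b = [2.0, -1.0], -0.5
far_from_boundary = [1.0, 0.0]
near_boundary = [0.4, 0.2]

print(is_delta_robust(w, b, far_from_boundary, delta=0.1))
print(is_delta_robust(w, b, near_boundary, delta=0.1))
```

Both candidates are valid for the original model, but only the one far from the decision boundary keeps its label across the whole interval of perturbed models, which is the distinction a robustness check of this kind is meant to capture.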

Findings

01. The approach offers provable robustness guarantees for counterfactual explanations (CEs) under a broad set of model changes.

02. Empirical results demonstrate the method's effectiveness on neural networks and logistic regression models.

03. In benchmarks, the proposed algorithms outperform the compared methods at generating robust CEs.

Abstract

Counterfactual Explanations (CEs) have emerged as a major paradigm in explainable AI research, providing recourse recommendations for users affected by the decisions of machine learning models. However, CEs found by existing methods often become invalid when slight changes occur in the parameters of the model they were generated for. The literature lacks a way to provide exhaustive robustness guarantees for CEs under model changes, in that existing methods to improve CEs' robustness are mostly heuristic, and the robustness performances are evaluated empirically using only a limited number of retrained models. To bridge this gap, we propose a novel interval abstraction technique for parametric machine learning models, which allows us to obtain provable robustness guarantees for CEs under a possibly infinite set of plausible model changes $\Delta$. Based on this idea, we formalise a…
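To make the notion of an interval abstraction concrete for a neural network, the following is a minimal sketch using plain interval arithmetic. It is an illustration under stated assumptions rather than the paper's procedure: it assumes every weight and bias may shift by at most `delta`, uses a small feed-forward ReLU network with one output logit, and yields a sound but possibly loose over-approximation of the output range. All names (`output_bounds`, `layer_bounds`, `widen`) and the example parameters are hypothetical.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]
Layer = Tuple[Sequence[Sequence[float]], Sequence[float]]  # (weights, biases)


def widen(value: float, delta: float) -> Interval:
    """Interval of parameter values within delta of the original value."""
    return value - delta, value + delta


def mul(a: Interval, b: Interval) -> Interval:
    """Interval product: the extremes occur at endpoint combinations."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return min(products), max(products)


def add(a: Interval, b: Interval) -> Interval:
    return a[0] + b[0], a[1] + b[1]


def relu(a: Interval) -> Interval:
    return max(0.0, a[0]), max(0.0, a[1])


def layer_bounds(layer: Layer, inputs: List[Interval], delta: float) -> List[Interval]:
    """Pre-activation bounds of one layer with interval-valued parameters."""
    weights, biases = layer
    out = []
    for row, bias in zip(weights, biases):
        acc = widen(bias, delta)
        for w_ij, x_j in zip(row, inputs):
            acc = add(acc, mul(widen(w_ij, delta), x_j))
        out.append(acc)
    return out


def output_bounds(layers: Sequence[Layer], x: Sequence[float], delta: float) -> List[Interval]:
    """Propagate a point input through the interval network.
    ReLU is applied between layers but not after the last one."""
    bounds: List[Interval] = [(xi, xi) for xi in x]
    for k, layer in enumerate(layers):
        bounds = layer_bounds(layer, bounds, delta)
        if k < len(layers) - 1:
            bounds = [relu(iv) for iv in bounds]
    return bounds


# Hypothetical 2-2-1 network and candidate counterfactual.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0]], [-0.5]),
]
lo, hi = output_bounds(layers, [2.0, 1.0], delta=0.05)[0]
print("robust for class 1:", lo > 0.0)
```

If the lower bound of the output logit stays above zero, no model inside the interval can flip the counterfactual's label. Because the bounds over-approximate, a lower bound at or below zero is inconclusive under this sketch and does not by itself show that the counterfactual is fragile.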

Code & Models

Repositories

junqi-jiang/interval-abstractions (PyTorch, official)

Taxonomy

Methods: Logistic Regression · Sparse Evolutionary Training