Provably Robust and Plausible Counterfactual Explanations for Neural Networks via Robust Optimisation

Junqi Jiang; Jianglin Lan; Francesco Leofante; Antonio Rago; Francesca Toni

arXiv:2309.12545·cs.LG·April 5, 2024·2 cites

TL;DR

This paper introduces PROPLACE, a novel method for generating counterfactual explanations for neural networks that are provably robust, plausible, and optimized for closeness, addressing limitations of existing approaches.

Contribution

PROPLACE is the first method to simultaneously optimize for robustness, plausibility, and closeness in counterfactual explanations with proven convergence and soundness.

Findings

01. PROPLACE outperforms five baseline methods on robustness, plausibility, and closeness metrics.

02. The iterative algorithm for PROPLACE converges and guarantees soundness and completeness.

03. Experimental results demonstrate state-of-the-art performance across multiple evaluation metrics.

Abstract

Counterfactual Explanations (CEs) have received increasing interest as a major methodology for explaining neural network classifiers. Usually, CEs for an input-output pair are defined as data points with minimum distance to the input that are classified with a different label than the output. To tackle the established problem that CEs are easily invalidated when model parameters are updated (e.g. retrained), studies have proposed ways to certify the robustness of CEs under model parameter changes bounded by a norm ball. However, existing methods targeting this form of robustness are not sound or complete, and they may generate implausible CEs, i.e., outliers with respect to the training dataset. In fact, no existing method simultaneously optimises for closeness and plausibility while preserving robustness guarantees. In this work, we propose Provably RObust and PLAusible Counterfactual Explanations…
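The definitions in the abstract (validity, closeness, and robustness to parameter changes bounded by a norm ball) can be illustrated on a toy case. The sketch below is not PROPLACE and is not taken from the paper or its repository: it assumes a linear classifier instead of a neural network, an L-infinity ball of radius `delta` around the weights and bias, and L1 distance as the closeness measure. All function names and numbers are hypothetical. For a linear model the worst-case score inside the ball has a closed form, so robustness can be checked exactly.

```python
from typing import Sequence


def score(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Linear score w.x + b; a score >= 0 means class 1 (assumed convention)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def worst_case_score(w: Sequence[float], b: float,
                     x: Sequence[float], delta: float) -> float:
    """Lowest score over all models whose weights and bias each move by at most delta.

    Minimising (w + dw).x + (b + db) with |dw_i| <= delta and |db| <= delta
    gives w.x + b - delta * (sum_i |x_i| + 1).
    """
    return score(w, b, x) - delta * (sum(abs(xi) for xi in x) + 1.0)


def is_valid_ce(w: Sequence[float], b: float, x_cf: Sequence[float]) -> bool:
    """Valid: the current model assigns the counterfactual to class 1."""
    return score(w, b, x_cf) >= 0.0


def is_robust_ce(w: Sequence[float], b: float,
                 x_cf: Sequence[float], delta: float) -> bool:
    """Robust: every model in the delta-ball still assigns class 1."""
    return worst_case_score(w, b, x_cf, delta) >= 0.0


def l1_distance(x: Sequence[float], x_cf: Sequence[float]) -> float:
    """Closeness measured as L1 distance to the original input (an assumption)."""
    return sum(abs(a - c) for a, c in zip(x, x_cf))


# Hypothetical toy model and input (class 0 under the original model).
w, b = [1.0, -1.0], 0.0
x = [0.0, 1.0]

near_cf = [1.0, 0.9]  # close to x, barely across the decision boundary
far_cf = [2.0, 0.5]   # further from x, well inside class 1

for name, cf in (("near", near_cf), ("far", far_cf)):
    print(name,
          "valid:", is_valid_ce(w, b, cf),
          "robust:", is_robust_ce(w, b, cf, delta=0.1),
          "L1 distance:", round(l1_distance(x, cf), 3))
```

The closer candidate is valid for the current model but can be invalidated by a small parameter change, while the robust candidate sits further from the input. This is the closeness versus robustness tension the abstract describes; plausibility is not modelled in this sketch.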

Code & Models

Repositories

junqi-jiang/proplace (PyTorch, official)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning in Materials Science