Model Reconstruction Using Counterfactual Explanations: A Perspective From Polytope Theory

Pasan Dissanayake; Sanghamitra Dutta

arXiv:2405.05369 · cs.LG · November 13, 2024 · 1 citation

TL;DR

This paper uses polytope theory to analyze model reconstruction from counterfactual explanations and proposes an attack strategy, the Counterfactual Clamping Attack (CCA), that improves fidelity and mitigates decision boundary shift.

Contribution

The paper derives theoretical relationships between the model reconstruction error and the number of counterfactual queries, and proposes the CCA method for more accurate model reconstruction.

Findings

01 CCA improves fidelity between the surrogate and target models

02 Theoretical bounds relate reconstruction error to the number of counterfactual queries

03 The approach mitigates the decision boundary shift issue

Abstract

Counterfactual explanations provide ways of achieving a favorable model outcome with minimum input perturbation. However, counterfactual explanations can also be leveraged to reconstruct the model by strategically training a surrogate model to give similar predictions as the original (target) model. In this work, we analyze how model reconstruction using counterfactuals can be improved by further leveraging the fact that the counterfactuals also lie quite close to the decision boundary. Our main contribution is to derive novel theoretical relationships between the error in model reconstruction and the number of counterfactual queries required using polytope theory. Our theoretical analysis leads us to propose a strategy for model reconstruction that we call Counterfactual Clamping Attack (CCA) which trains a surrogate model using a unique loss function that treats counterfactuals…
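
The abstract is cut off before it says how the loss treats counterfactuals, so the following is only a minimal sketch of one plausible reading, not the authors' definition. It assumes a binary classifier with outputs in (0, 1), a decision threshold of 0.5, and a one-sided penalty: a counterfactual contributes loss only while the surrogate scores it below the threshold, whereas ordinary queried instances use standard binary cross-entropy. The names `bce`, `clamped_loss`, `surrogate_loss`, and the parameter `k` are hypothetical.

```python
import math


def bce(p, y, eps=1e-12):
    """Binary cross-entropy between a prediction p in (0, 1) and a target y in [0, 1]."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def clamped_loss(pred, label, is_counterfactual, k=0.5):
    """Hypothetical one-sided loss (an assumption, not taken from the source).

    Ordinary instances: plain BCE against the target model's label.
    Counterfactuals: penalized only while pred < k, which pulls them up to
    the assumed decision threshold k but never pushes them further.
    """
    if not is_counterfactual:
        return bce(pred, label)
    if pred >= k:
        return 0.0
    # Shift by bce(k, k) so the penalty goes to zero as pred approaches k.
    return bce(pred, k) - bce(k, k)


def surrogate_loss(preds, labels, cf_flags, k=0.5):
    """Mean loss over a batch that mixes ordinary instances and counterfactuals."""
    losses = [clamped_loss(p, y, c, k) for p, y, c in zip(preds, labels, cf_flags)]
    return sum(losses) / len(losses)
```

Under this assumption, a counterfactual that the surrogate already places on the favorable side of the threshold adds no loss. The intent is to avoid forcing near-boundary points deep into the favorable region, which is one way the decision boundary shift mentioned above could arise.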

Code & Models

Repositories

pasandissanayake/model-reconstruction-using-counterfactuals (PyTorch, official)

Videos

Model Reconstruction Using Counterfactual Explanations: A Perspective From Polytope Theory · SlidesLive

Taxonomy

Topics: Scientific Computing and Data Management

Methods: Counterfactual Explanations