The privacy issue of counterfactual explanations: explanation linkage attacks

Sofie Goethals; Kenneth Sörensen; David Martens

arXiv:2210.12051·cs.LG·October 24, 2022

Open Access

TL;DR

This paper identifies a privacy risk in explainable AI, the explanation linkage attack on instance-based counterfactual explanations, and proposes k-anonymous counterfactual explanations together with a new metric, pureness, for evaluating their validity. Making the explanations k-anonymous, rather than the whole dataset, is reported to benefit explanation quality.

Contribution

The paper introduces the explanation linkage attack, proposes k-anonymous counterfactual explanations as a countermeasure, and introduces pureness as a metric for evaluating the validity of those explanations.

Findings

01

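k-anonymous counterfactual explanations are proposed as a defense against explanation linkage attacks

02

Pureness is introduced as a metric for evaluating the validity of k-anonymous counterfactual explanations

03

Making the explanations k-anonymous, rather than the whole dataset, is beneficial for explanation quality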

Abstract

Black-box machine learning models are being used in more and more high-stakes domains, which creates a growing need for Explainable AI (XAI). Unfortunately, the use of XAI in machine learning introduces new privacy risks, which currently remain largely unnoticed. We introduce the explanation linkage attack, which can occur when deploying instance-based strategies to find counterfactual explanations. To counter such an attack, we propose k-anonymous counterfactual explanations and introduce pureness as a new metric to evaluate the validity of these k-anonymous counterfactual explanations. Our results show that making the explanations, rather than the whole dataset, k-anonymous is beneficial for the quality of the explanations.
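The attack and the countermeasure named in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the data, the attribute names, the distance function, the generalization scheme (age bands and zip prefixes), and the reading of pureness as the share of instances covered by the generalized explanation that receive the desired outcome. The abstract on this page does not define pureness formally.

```python
# Toy illustration only: data, names, and distance are hypothetical.
QI = ("age", "zip")  # quasi-identifiers an attacker can look up elsewhere

# Hypothetical training data; "income" is the private attribute.
TRAIN = [
    {"age": 34, "zip": "2000", "income": 52000, "approved": True},
    {"age": 35, "zip": "2000", "income": 48000, "approved": True},
    {"age": 51, "zip": "2018", "income": 61000, "approved": True},
    {"age": 29, "zip": "2060", "income": 23000, "approved": False},
    {"age": 38, "zip": "2050", "income": 30000, "approved": False},
]

# Hypothetical public register that maps quasi-identifiers to names.
PUBLIC = [
    {"name": "Alice", "age": 34, "zip": "2000"},
    {"name": "Bob", "age": 35, "zip": "2000"},
    {"name": "Carol", "age": 51, "zip": "2018"},
]


def nearest_unlike_neighbor(query, train):
    """Instance-based counterfactual: closest training row with the desired outcome."""
    candidates = [r for r in train if r["approved"]]
    return min(
        candidates,
        key=lambda r: abs(r["age"] - query["age"])
        + abs(r["income"] - query["income"]) / 1000,
    )


def link(row, public):
    """Names in the public register whose quasi-identifiers match the row exactly."""
    return [p["name"] for p in public if all(p[q] == row[q] for q in QI)]


def generalize(row):
    """Coarsen quasi-identifiers: age to a ten-year band, zip to a two-digit prefix."""
    low = row["age"] // 10 * 10
    return {"age": (low, low + 9), "zip_prefix": row["zip"][:2]}


def covered(region, rows):
    """Rows whose quasi-identifiers fall inside the generalized region."""
    return [
        r
        for r in rows
        if region["age"][0] <= r["age"] <= region["age"][1]
        and r["zip"].startswith(region["zip_prefix"])
    ]


# A rejected applicant asks for a counterfactual explanation.
query = {"age": 30, "zip": "2060", "income": 25000}
cf = nearest_unlike_neighbor(query, TRAIN)

# Linkage: the raw counterfactual is a real training row, so its
# quasi-identifiers single out one person and expose that person's income.
linked = link(cf, PUBLIC)

# Countermeasure: release a generalized region instead of the raw row.
region = generalize(cf)
inside = covered(region, TRAIN)
k = len(inside)  # number of training rows sharing the generalized explanation
pureness = sum(r["approved"] for r in inside) / k  # assumed reading of the metric

print(linked, cf["income"], region, k, round(pureness, 2))
```

In this toy setup the raw counterfactual links to a single named person, while the generalized region covers several training rows, so the explanation no longer points at one individual. The trade-off is that not every instance inside the region is guaranteed to receive the desired outcome, which is the kind of validity loss a metric such as pureness is meant to quantify.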

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning in Healthcare