Towards integration of Privacy Enhancing Technologies in Explainable Artificial Intelligence
Sonal Allana, Rozita Dara, Xiaodong Lin, Pulei Xiong

TL;DR
This paper investigates integrating Privacy Enhancing Technologies into Explainable AI to defend against privacy attacks, evaluating their effectiveness and impact on system utility and explanation quality.
Contribution
It introduces and empirically evaluates three PET methods for protecting privacy in feature-based XAI explanations, addressing a critical security gap.
Findings
PETs reduced privacy attack success by up to 49.47%
Different PETs have varying impacts on utility and performance
Strategies for effective PETs integration in XAI are proposed
Abstract
Explainable Artificial Intelligence (XAI) is a crucial pathway in mitigating the risk of non-transparency in the decision-making process of black-box Artificial Intelligence (AI) systems. However, despite these benefits, XAI methods have been found to leak private information about individuals whose data is used in training or querying the models. Researchers have demonstrated privacy attacks that exploit explanations to infer sensitive personal information about individuals. Currently, there is a lack of defenses against known privacy attacks targeting explanations when vulnerable XAI methods are used in production and machine-learning-as-a-service systems. To address this gap, in this article, we explore Privacy Enhancing Technologies (PETs) as a defense mechanism against attribute inference on explanations provided by feature-based XAI methods. We empirically evaluate 3 types of PETs, namely synthetic training…
