On the interplay of Explainability, Privacy and Predictive Performance with Explanation-assisted Model Extraction

Fatima Ezzeddine; Rinad Akel; Ihab Sbeity; Silvia Giordano; Marc Langheinrich; Omran Ayoub

arXiv:2505.08847 · cs.CR · May 15, 2025

TL;DR

This paper examines how explainability, privacy, and predictive performance interact in machine learning models, focusing on whether differential privacy can mitigate model extraction attacks that exploit explanations.

Contribution

The paper investigates the trade-offs among model performance, privacy, and explainability, evaluating differential privacy applied during training and during explanation generation as a defense against explanation-assisted extraction attacks.

Findings

01. Differential privacy can reduce the risk of model extraction attacks.

02. Trade-offs exist between privacy guarantees and model accuracy.

03. Privacy during explanation generation impacts attack success.

Abstract

Machine Learning as a Service (MLaaS) has gained significant traction as a means of deploying powerful predictive models, offering ease of use that enables organizations to leverage advanced analytics without substantial investments in specialized infrastructure or expertise. However, MLaaS platforms must be safeguarded against security and privacy attacks, such as model extraction attacks (MEA). The increasing integration of explainable AI (XAI) within MLaaS has introduced an additional privacy challenge, as attackers can exploit model explanations, particularly counterfactual explanations (CFs), to facilitate MEA. In this paper, we investigate the trade-offs among model performance, privacy, and explainability when employing Differential Privacy (DP), a promising technique for mitigating CF-facilitated MEA. We evaluate two distinct DP strategies: implemented during the classification…
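
To make the threat concrete, the following is a minimal toy sketch of why counterfactual explanations can help an attacker; it is not the paper's experimental setup. It assumes a hypothetical one-dimensional threshold classifier as the target, and the names `HIDDEN_THRESHOLD`, `MARGIN`, `target_predict`, `target_counterfactual`, and `extract` are all illustrative. Because each counterfactual lies close to the decision boundary, an attacker who receives it alongside the prediction can locate the boundary with far fewer queries.

```python
# Toy illustration only: a 1-D threshold "model" stands in for an MLaaS classifier.
HIDDEN_THRESHOLD = 0.37  # hypothetical decision boundary, unknown to the attacker
MARGIN = 0.01            # hypothetical step used to cross the boundary


def target_predict(x):
    """Black-box prediction API: returns the class label for input x."""
    return 1 if x >= HIDDEN_THRESHOLD else 0


def target_counterfactual(x):
    """Explanation API: a nearby input that receives the opposite label."""
    if target_predict(x) == 1:
        return HIDDEN_THRESHOLD - MARGIN  # just below the boundary, label 0
    return HIDDEN_THRESHOLD               # on the boundary, label 1


def fit_surrogate_threshold(labeled_points):
    """Surrogate model: midpoint between the closest known 0-point and 1-point."""
    zeros = [x for x, y in labeled_points if y == 0]
    ones = [x for x, y in labeled_points if y == 1]
    return (max(zeros) + min(ones)) / 2


def extract(queries, use_counterfactuals):
    """Query the target and fit a surrogate, optionally using counterfactuals."""
    data = []
    for x in queries:
        y = target_predict(x)
        data.append((x, y))
        if use_counterfactuals:
            # A counterfactual is, by construction, labeled with the opposite class.
            data.append((target_counterfactual(x), 1 - y))
    return fit_surrogate_threshold(data)


queries = [0.1, 0.9]  # the attacker's query budget: two inputs
error_labels_only = abs(extract(queries, False) - HIDDEN_THRESHOLD)
error_with_cfs = abs(extract(queries, True) - HIDDEN_THRESHOLD)
print("surrogate error, labels only:", error_labels_only)
print("surrogate error, with counterfactuals:", error_with_cfs)
```

In this sketch the same two queries give a much closer estimate of the boundary once counterfactuals are included. Perturbing the returned counterfactuals, for example with calibrated noise, would blur this signal, which is the intuition behind applying DP at explanation time.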

Taxonomy

Topics: Explainable Artificial Intelligence (XAI)