Explanations Leak: Membership Inference with Differential Privacy and Active Learning Defense

Fatima Ezzeddine; Osama Zammar; Silvia Giordano; Omran Ayoub

arXiv:2602.03611·cs.LG·February 4, 2026

Open Access

TL;DR

This paper investigates how counterfactual explanations in MLaaS increase privacy risks via membership inference attacks and proposes a defense combining Differential Privacy and Active Learning to mitigate these risks while maintaining utility.

Contribution

The paper systematically analyzes the privacy risks that counterfactual explanations introduce and proposes a defense framework that integrates Differential Privacy with Active Learning.

Findings

01. Exposing CFs via APIs enhances membership inference attacks.

02. The proposed DP and AL defense reduces privacy leakage.

03. Trade-offs exist between privacy, utility, and explanation quality.

Abstract

Counterfactual explanations (CFs) are increasingly integrated into Machine Learning as a Service (MLaaS) systems to improve transparency; however, ML models deployed via APIs are already vulnerable to privacy attacks such as membership inference and model extraction, and the impact of explanations on this threat landscape remains insufficiently understood. In this work, we focus on the problem of how CFs expand the attack surface of MLaaS by strengthening membership inference attacks (MIAs), and on the need to design defense mechanisms that mitigate this emerging risk without undermining utility and explainability. First, we systematically analyze how exposing CFs through query-based APIs enables more effective shadow-based MIAs. Second, we propose a defense framework that integrates Differential Privacy (DP) with Active Learning (AL) to jointly reduce memorization and limit effective…
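
To make the notion of a shadow-based MIA concrete, the following is a minimal sketch and not the authors' attack. It assumes a toy "model" that simply memorizes its training points and returns a confidence of exp(-distance to the nearest memorized point); the helper names (`train`, `confidence`, `fit_threshold`, `infer_member`) and the synthetic Gaussian data are hypothetical and chosen only for illustration. The attacker trains a shadow model on data with known membership, fits a threshold on the shadow model's confidences, and reuses that threshold against the target model. The sketch uses confidence scores only and omits the counterfactual explanations that the paper studies.

```python
import math
import random


def train(points):
    """'Training' for this toy model is pure memorization of the data."""
    return list(points)


def confidence(model, x):
    """Toy confidence: exp(-distance to the nearest memorized point).

    A memorized (member) point has distance 0, so its score is exactly 1.0.
    """
    nearest = min(math.dist(x, p) for p in model)
    return math.exp(-nearest)


def sample(rng, n):
    """Draw n points from a 2-D standard normal (stand-in for real data)."""
    return [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]


def fit_threshold(member_scores, nonmember_scores):
    """Attack 'model': midpoint between mean member and non-member scores."""
    mean_in = sum(member_scores) / len(member_scores)
    mean_out = sum(nonmember_scores) / len(nonmember_scores)
    return (mean_in + mean_out) / 2.0


rng = random.Random(0)

# Attacker side: the shadow model is trained on data the attacker controls,
# so the membership label of every shadow query is known.
shadow_in, shadow_out = sample(rng, 200), sample(rng, 200)
shadow_model = train(shadow_in)
threshold = fit_threshold(
    [confidence(shadow_model, x) for x in shadow_in],
    [confidence(shadow_model, x) for x in shadow_out],
)

# Victim side: the target model is only reachable through queries.
target_in, target_out = sample(rng, 200), sample(rng, 200)
target_model = train(target_in)


def infer_member(x):
    """Predict 'member' when the target's confidence exceeds the threshold."""
    return confidence(target_model, x) > threshold


hits = sum(infer_member(x) for x in target_in)
false_alarms = sum(infer_member(x) for x in target_out)
accuracy = (hits + len(target_out) - false_alarms) / (
    len(target_in) + len(target_out)
)
print(f"threshold={threshold:.3f} attack accuracy={accuracy:.3f}")
```

Because the toy model memorizes perfectly, every true member scores exactly 1.0 and is flagged; real models leak far less cleanly, which is why practical attacks train a classifier on richer shadow-model outputs rather than a single threshold.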

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data