Evaluating Membership Inference Attacks in heterogeneous-data setups

Bram van Dartel; Marc Damie; Florian Hahn

arXiv:2502.18986 · cs.CR · December 2, 2025

Open Access

TL;DR

This paper investigates how membership inference attacks perform under heterogeneous data conditions, introducing a new metric for data distribution differences and comparing simulation methodologies to better reflect real-world scenarios.

Contribution

It introduces a novel heterogeneity metric for tabular data distributions and compares two methods for simulating data heterogeneity in MIA evaluations.

Findings

01 MIA accuracy varies significantly with the data heterogeneity setup.

02 Current evaluation methods lack standardization for heterogeneous data.

03 Heterogeneity affects the reliability of privacy risk assessments.

Abstract

Among all privacy attacks against Machine Learning (ML), membership inference attacks (MIA) have attracted the most attention. In these attacks, the attacker is given an ML model and a data point, and must infer whether the data point was used for training. The attacker also has an auxiliary dataset to tune their inference algorithm. Attack papers commonly simulate setups in which the attacker's and the target's datasets are sampled from the same distribution. This setting is convenient for running experiments, but it rarely holds in practice. ML literature commonly starts with similar simplifying assumptions (i.e., "i.i.d." datasets), and later generalizes the results to support heterogeneous data distributions. Similarly, our work takes a first step toward generalizing MIA evaluation to heterogeneous data. First, we design a metric to measure the heterogeneity between any…
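To make the attack setup described in the abstract concrete, here is a minimal sketch of a loss-threshold membership inference attack, a common baseline and not necessarily the attack evaluated in the paper. The attacker tunes a threshold on per-sample losses from an auxiliary dataset, then labels a target point a "member" when the target model's loss on it falls at or below that threshold. All function names and all loss values are hypothetical toy inputs chosen for illustration; they are not results from the paper.

```python
from typing import Sequence


def tune_threshold(member_losses: Sequence[float],
                   nonmember_losses: Sequence[float]) -> float:
    """Pick the loss threshold that best separates auxiliary members
    from auxiliary non-members (rule: predict 'member' if loss <= t)."""
    candidates = sorted(set(member_losses) | set(nonmember_losses))
    total = len(member_losses) + len(nonmember_losses)
    best_t, best_acc = candidates[0], -1.0
    for t in candidates:
        correct = sum(l <= t for l in member_losses)
        correct += sum(l > t for l in nonmember_losses)
        acc = correct / total
        if acc > best_acc:  # keep the first (lowest) best threshold
            best_t, best_acc = t, acc
    return best_t


def infer_membership(loss: float, threshold: float) -> bool:
    """Guess that a point was in the training set if its loss is low."""
    return loss <= threshold


def attack_accuracy(member_losses: Sequence[float],
                    nonmember_losses: Sequence[float],
                    threshold: float) -> float:
    """Fraction of target points whose membership is guessed correctly."""
    correct = sum(infer_membership(l, threshold) for l in member_losses)
    correct += sum(not infer_membership(l, threshold) for l in nonmember_losses)
    return correct / (len(member_losses) + len(nonmember_losses))


# Toy auxiliary losses (assumed values) used to tune the attack.
aux_member = [0.05, 0.10, 0.20, 0.15]
aux_nonmember = [0.60, 0.90, 0.40, 1.20]
threshold = tune_threshold(aux_member, aux_nonmember)

# Toy target whose losses resemble the auxiliary data.
similar_acc = attack_accuracy([0.08, 0.12, 0.18, 0.22],
                              [0.50, 0.70, 0.45, 1.00], threshold)

# Toy target whose losses are shifted relative to the auxiliary data.
shifted_acc = attack_accuracy([0.30, 0.35, 0.25, 0.40],
                              [0.55, 0.80, 0.60, 1.10], threshold)
```

In this toy example the threshold tuned on the auxiliary data transfers well to the similar target but drops to chance level on the shifted one, which illustrates why the same-distribution assumption matters when evaluating such attacks.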

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)