Membership Inference Attacks from Causal Principles
Mathieu Even, Clément Berenfeld, Linus Bleistein, Tudor Cebere, Julie Josse, Aurélien Bellet

TL;DR
This paper introduces a causal inference framework for evaluating membership inference attacks (MIAs). The framework enables reliable privacy risk assessment without extensive retraining, which is especially useful for large models and under distribution shift.
Contribution
The paper formalizes MIA evaluation as a causal inference problem, identifies sources of bias in existing evaluation protocols, and proposes consistent estimators for several evaluation regimes.
Findings
The causal formulation reveals biases in current MIA evaluation methods.
The proposed estimators measure memorization reliably with limited retraining.
The approach remains effective under distribution shift.
Abstract
Membership Inference Attacks (MIAs) are widely used to quantify training data memorization and assess privacy risks. Standard evaluation requires repeated retraining, which is computationally costly for large models. One-run methods (single training with randomized data inclusion) and zero-run methods (post hoc evaluation) are often used instead, though their statistical validity remains unclear. To address this gap, we frame MIA evaluation as a causal inference problem, defining memorization as the causal effect of including a data point in the training set. This novel formulation reveals and formalizes key sources of bias in existing protocols: one-run methods suffer from interference between jointly included points, while zero-run evaluations popular for LLMs are confounded by non-random membership assignment. We derive causal analogues of standard MIA metrics and propose practical…
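To make the causal reading concrete, here is a minimal sketch of the simplest case: membership is assigned at random and the effect of inclusion on an attack score is estimated by a difference in means. It is not one of the paper's estimators. The function names (`difference_in_means`, `simulate_scores`), the toy score model, and the inclusion probability of 1/2 are all assumptions made for illustration. The simulation also assumes no interference between points, which is the very assumption the abstract says one-run evaluation can violate.

```python
import random
import statistics


def difference_in_means(scores, included):
    """Mean score of included points minus mean score of excluded points.

    Under randomized inclusion and no interference, this estimates the
    average causal effect of inclusion on the attack score.
    """
    treated = [s for s, t in zip(scores, included) if t]
    control = [s for s, t in zip(scores, included) if not t]
    if not treated or not control:
        raise ValueError("need at least one included and one excluded point")
    return statistics.fmean(treated) - statistics.fmean(control)


def simulate_scores(n, true_effect, rng):
    """Toy data (hypothetical): each point is included with probability 1/2.

    The attack score is a standard normal baseline, shifted by
    `true_effect` when the point was included in training.
    """
    included = [rng.random() < 0.5 for _ in range(n)]
    scores = [
        rng.gauss(0.0, 1.0) + (true_effect if t else 0.0)
        for t in included
    ]
    return scores, included


rng = random.Random(0)  # fixed seed so the sketch is reproducible
scores, included = simulate_scores(20_000, 0.3, rng)
estimate = difference_in_means(scores, included)
print(f"estimated effect of inclusion: {estimate:.3f}")
```

If membership were instead correlated with the baseline score, as in the non-random assignment the abstract describes for zero-run evaluation, the same difference in means would mix the effect of inclusion with that pre-existing difference.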
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Advanced Graph Neural Networks
