Stolen Memories: Leveraging Model Memorization for Calibrated White-Box Membership Inference
Klas Leino, Matt Fredrikson

TL;DR
This paper introduces a white-box membership inference attack that exploits model internals and overfitting patterns to determine whether a data point was in the training set. The attack outperforms prior black-box methods and calls the effectiveness of existing defenses into question.
Contribution
The work presents a novel white-box membership inference (MI) attack that leverages model internals and insights into overfitting, with improved confidence calibration and an analysis of the limitations of existing defenses.
Findings
White-box attack outperforms black-box methods in accuracy.
Calibrated attack achieves high precision in membership inference.
Existing defenses like differential privacy have limited effectiveness.
Abstract
Membership inference (MI) attacks exploit the fact that machine learning algorithms sometimes leak information about their training data through the learned model. In this work, we study membership inference in the white-box setting in order to exploit the internals of a model, which have not been effectively utilized by previous work. Leveraging new insights about how overfitting occurs in deep neural networks, we show how a model's idiosyncratic use of features can provide evidence for membership to white-box attackers, even when the model's black-box behavior appears to generalize well, and demonstrate that this attack outperforms prior black-box methods. Taking the position that an effective attack should have the ability to provide confident positive inferences, we find that previous attacks do not often provide a meaningful basis for confidently inferring membership, whereas our…
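The idea of a "confident positive inference" can be illustrated with a small sketch. This is not the authors' attack. It assumes the attacker already has some real-valued membership score per example (higher meaning "more likely a training member") and a small labeled calibration set. The function names, scores, and labels below are all hypothetical. The sketch picks the lowest decision threshold whose precision on the calibration set meets a target, which keeps recall as high as possible under that precision constraint.

```python
from typing import List, Optional, Tuple

# Hypothetical calibration data: attack scores and true membership labels.
# These values are illustrative only and do not come from the paper.
scores: List[float] = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels: List[bool] = [True, True, False, True, False, True, False, False]


def precision_recall(
    scores: List[float], labels: List[bool], threshold: float
) -> Tuple[float, float]:
    """Precision and recall when predicting 'member' for score >= threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def calibrate_threshold(
    scores: List[float], labels: List[bool], target_precision: float
) -> Optional[float]:
    """Return the lowest threshold whose precision meets the target.

    Recall never increases as the threshold rises, so the lowest
    qualifying threshold also gives the highest recall among those
    that satisfy the precision target. Returns None if no threshold
    on the calibration set reaches the target.
    """
    for t in sorted(set(scores)):
        precision, _ = precision_recall(scores, labels, t)
        if precision >= target_precision:
            return t
    return None


threshold = calibrate_threshold(scores, labels, target_precision=0.9)
if threshold is not None:
    precision, recall = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold:.2f} precision={precision:.2f} recall={recall:.2f}")
```

The trade-off shown here is the usual one: demanding higher precision pushes the threshold up, so fewer examples are flagged as members but each flag is more trustworthy. In practice the threshold would be chosen on data separate from the examples being attacked.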
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data
