To Trust or Not To Trust Prediction Scores for Membership Inference Attacks
Dominik Hintersdorf, Lukas Struppek, Kristian Kersting

TL;DR
This paper challenges the effectiveness of prediction-score-based membership inference attacks (MIAs) on modern deep networks, arguing that model overconfidence causes high false-positive rates and that the threat of MIAs is therefore overestimated.
Contribution
It demonstrates that overconfidence in deep models reduces MIA success and uses generative adversarial networks to produce samples that are falsely classified as training data, questioning the assumed threat level of MIAs.
Findings
Overconfidence leads to high false-positive rates in MIAs, both on known domains and on out-of-distribution data.
Generative adversarial networks can produce samples falsely classified as training data.
A trade-off exists between model confidence and susceptibility to MIAs.
Abstract
Membership inference attacks (MIAs) aim to determine whether a specific sample was used to train a predictive model. Knowing this may indeed lead to a privacy breach. Most MIAs, however, make use of the model's prediction scores - the probability of each output given some input - following the intuition that the trained model tends to behave differently on its training data. We argue that this is a fallacy for many modern deep network architectures. Consequently, MIAs will miserably fail since overconfidence leads to high false-positive rates not only on known domains but also on out-of-distribution data and implicitly acts as a defense against MIAs. Specifically, using generative adversarial networks, we are able to produce a potentially infinite number of samples falsely classified as part of the training data. In other words, the threat of MIAs is overestimated, and less information…
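The intuition in the abstract can be illustrated with a minimal sketch of a max-score threshold attack, which flags an input as a training member when the model's highest softmax score exceeds a threshold. This is a simplified stand-in, not the attacks evaluated in the paper; the function names, the 0.9 threshold, and the logit values are all hypothetical and chosen only to show how overconfident outputs on non-member inputs turn into false positives.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predicts_member(logits, threshold=0.9):
    """Hypothetical attack rule: flag as 'member' if the top score is high."""
    return max(softmax(logits)) >= threshold

def false_positive_rate(non_member_logits, threshold=0.9):
    """Fraction of known non-members that the attack flags as members."""
    flagged = sum(predicts_member(l, threshold) for l in non_member_logits)
    return flagged / len(non_member_logits)

# Made-up logits for inputs that were NOT in the training set.
# An overconfident model produces peaked outputs even on unseen data.
overconfident = [[9.0, 0.5, 0.2], [0.1, 8.0, 0.3], [0.4, 0.2, 7.5], [6.0, 0.1, 0.0]]
# A less confident model spreads probability mass across classes.
calibrated = [[1.2, 0.8, 0.5], [0.3, 1.0, 0.6], [0.9, 0.4, 1.1], [2.0, 0.5, 0.3]]

fpr_overconfident = false_positive_rate(overconfident)
fpr_calibrated = false_positive_rate(calibrated)
print(fpr_overconfident, fpr_calibrated)
```

With these illustrative values, every non-member from the overconfident model is flagged as a member while none from the less confident model is, which is the failure mode the abstract describes: a high prediction score alone does not distinguish training data from unseen data.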
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data
