The Sample Complexity of Membership Inference and Privacy Auditing
Mahdi Haghifam, Adam Smith, Jonathan Ullman

TL;DR
This paper studies the minimum number of reference samples an attacker needs to mount a successful membership-inference attack. It shows that the attacker sometimes needs significantly more reference samples than there are training samples, which affects how privacy assessments should be interpreted.
Contribution
It provides the first analysis of the sample complexity of membership inference in the setting of Gaussian mean estimation, and shows that additional knowledge of the data distribution can enable more powerful attacks.
Findings
An attacker may need more reference samples than there are training samples for a successful attack.
Attacks used in practice are limited to O(n) reference samples, where n is the training set size, so they may underestimate privacy risk.
More information about the data distribution enables stronger membership-inference attacks.
Abstract
A membership-inference attack gets the output of a learning algorithm, and a target individual, and tries to determine whether this individual is a member of the training data or an independent sample from the same distribution. A successful membership-inference attack typically requires the attacker to have some knowledge about the distribution that the training data was sampled from, and this knowledge is often captured through a set of independent reference samples from that distribution. In this work we study how much information the attacker needs for membership inference by investigating the sample complexity (the minimum number of reference samples required) for a successful attack. We study this question in the fundamental setting of Gaussian mean estimation where the learning algorithm is given n samples from a Gaussian distribution in d dimensions,…
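To make the setting concrete, here is a minimal simulation sketch of a membership-inference attack against a released empirical mean. It is not the attack analyzed in the paper: it uses a classical inner-product score and assumes a known identity covariance, and the function names (`inner_product_score`, `simulate`), the threshold rule, and all parameter values are hypothetical choices made for illustration. The attacker's only knowledge of the distribution comes from `m` reference samples.

```python
import numpy as np


def inner_product_score(target, released_mean, reference):
    """Score a target point against the released mean.

    The unknown true mean is replaced by the mean of the attacker's
    reference samples (assumption: identity covariance, so no whitening).
    """
    ref_mean = reference.mean(axis=0)
    return float(np.dot(target - ref_mean, released_mean - ref_mean))


def simulate(n=25, d=1000, m=25, trials=200, seed=0):
    """Return (member_scores, nonmember_scores) over independent trials."""
    rng = np.random.default_rng(seed)
    member_scores, nonmember_scores = [], []
    for _ in range(trials):
        mu = rng.normal(size=d)                    # unknown true mean
        train = mu + rng.normal(size=(n, d))       # training data
        reference = mu + rng.normal(size=(m, d))   # attacker's reference samples
        released_mean = train.mean(axis=0)         # output of the learner
        outsider = mu + rng.normal(size=d)         # fresh sample, not in training
        member_scores.append(
            inner_product_score(train[0], released_mean, reference))
        nonmember_scores.append(
            inner_product_score(outsider, released_mean, reference))
    return np.array(member_scores), np.array(nonmember_scores)


if __name__ == "__main__":
    n, d, m = 25, 1000, 25
    members, nonmembers = simulate(n=n, d=d, m=m)
    # Under the identity-covariance assumption the expected score is
    # d/n + d/m for a member and d/m for a non-member, so we threshold
    # at the midpoint between the two.
    threshold = d / m + d / (2 * n)
    accuracy = 0.5 * ((members > threshold).mean()
                      + (nonmembers <= threshold).mean())
    print(f"mean member score:     {members.mean():.1f}")
    print(f"mean non-member score: {nonmembers.mean():.1f}")
    print(f"balanced accuracy:     {accuracy:.3f}")
```

In this simplified setting the score separates members from non-members because a training point is correlated with the released mean, while a fresh sample is not. Varying `m` relative to `n` shows how the quality of the attacker's reference estimate affects the noise in the score. The threshold above depends on the covariance being known, which is exactly the kind of distributional knowledge the abstract says the attacker must otherwise obtain from reference samples.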
