TL;DR
MemGuard defends against black-box membership inference attacks by adding carefully crafted adversarial noise to the target classifier's confidence score vectors, misleading the attacker's classifier while keeping the utility loss bounded.
Contribution
MemGuard is the first method to use adversarial examples as a defense mechanism against membership inference attacks with formal utility-loss guarantees.
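To make "utility-loss guarantee" concrete, here is a minimal sketch of what a bounded perturbation of a confidence score vector can look like. It is not MemGuard's noise-crafting algorithm: it swaps in plain smoothing toward the uniform distribution as a stand-in. The assumptions are that the perturbed vector must remain a probability distribution, that the predicted label must not change, and that the L1 distortion must stay within a budget. The function name `smooth_with_budget` and the parameter `budget` are hypothetical.

```python
def smooth_with_budget(conf, budget):
    """Mix `conf` toward the uniform distribution under an L1 distortion budget.

    Illustrative stand-in only (NOT the MemGuard algorithm). It shows the
    assumed utility constraints: the output still sums to 1, the ordering of
    the classes is preserved, and the L1 change is at most `budget`.
    """
    uniform = 1.0 / len(conf)
    dist_to_uniform = sum(abs(c - uniform) for c in conf)
    if dist_to_uniform == 0.0:
        return list(conf)  # already uniform, nothing to change
    # The L1 distortion of the mix equals alpha * dist_to_uniform, so this
    # choice keeps it within the budget. Capping alpha below 1 keeps the
    # class ordering (and hence the predicted label) intact.
    alpha = min(0.99, budget / dist_to_uniform)
    return [(1.0 - alpha) * c + alpha * uniform for c in conf]


original = [0.90, 0.06, 0.04]
noisy = smooth_with_budget(original, budget=0.3)
distortion = sum(abs(a - b) for a, b in zip(original, noisy))
print("noisy vector:", [round(v, 3) for v in noisy])
print("L1 distortion:", round(distortion, 3))
```

A real defense in the spirit of the paper would choose the noise adversarially against the attacker's classifier rather than by fixed smoothing; the sketch only shows how a distortion budget constrains whatever noise is chosen.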
Findings
MemGuard effectively defends against membership inference attacks on three datasets.
It achieves better privacy-utility tradeoffs than existing methods.
Adversarial examples can be used as a novel defense strategy.
Abstract
In a membership inference attack, an attacker aims to infer whether a data sample is in a target classifier's training dataset or not. Specifically, given black-box access to the target classifier, the attacker trains a binary classifier, which takes a data sample's confidence score vector predicted by the target classifier as an input and predicts the data sample to be a member or non-member of the target classifier's training dataset. Membership inference attacks pose severe privacy and security threats to the training dataset. Most existing defenses leverage differential privacy when training the target classifier or regularize the training process of the target classifier. These defenses suffer from two key limitations: 1) they do not have formal utility-loss guarantees of the confidence score vectors, and 2) they achieve suboptimal privacy-utility tradeoffs. In this work, we…
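The attack described in the abstract can be sketched on synthetic data. The example below is a toy illustration, not the paper's experimental setup: it assumes that members receive more peaked confidence vectors than non-members, and it simulates both with a hypothetical `synthetic_confidences` helper instead of querying a real target classifier. The attacker's binary classifier is a small logistic regression written with the standard library only; all function names and hyperparameters are illustrative.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def synthetic_confidences(n, sharpness, num_classes, rng):
    """Hypothetical stand-in for target-classifier outputs.

    Larger `sharpness` yields more peaked confidence vectors, which is the
    assumed signature of training-set members in this toy setup.
    """
    return [softmax([rng.gauss(0.0, sharpness) for _ in range(num_classes)])
            for _ in range(n)]


def to_features(conf):
    # Sort descending so the attack ignores which class was predicted.
    return sorted(conf, reverse=True)


def sigmoid(x):
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_attack(X, y, epochs=300, lr=1.0):
    """Full-batch gradient descent for logistic regression (member = 1)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, t in zip(X, y):
            err = predict_proba(w, b, x) - t
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def make_split(n_per_class, rng):
    members = synthetic_confidences(n_per_class, 3.0, 5, rng)
    non_members = synthetic_confidences(n_per_class, 1.0, 5, rng)
    X = [to_features(c) for c in members + non_members]
    y = [1] * n_per_class + [0] * n_per_class
    return X, y


rng = random.Random(0)
X_train, y_train = make_split(200, rng)
X_test, y_test = make_split(200, rng)
w, b = train_attack(X_train, y_train)
preds = [1 if predict_proba(w, b, x) >= 0.5 else 0 for x in X_test]
accuracy = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
print(f"attack accuracy on synthetic data: {accuracy:.2f}")
```

An accuracy well above 0.5 on this toy data reflects only the built-in gap between the two synthetic distributions. It says nothing about attack strength against a real classifier, but it shows why a defense that reshapes confidence vectors can push the attacker back toward random guessing.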
