Defending Model Inversion and Membership Inference Attacks via Prediction Purification
Ziqi Yang, Bin Shao, Bohan Xuan, Ee-Chien Chang, Fan Zhang

TL;DR
This paper introduces a purification framework that reduces confidence score dispersion to defend neural networks against model inversion and membership inference attacks, effectively lowering attack success rates with minimal impact on model accuracy.
Contribution
A unified purification approach is proposed to defend against data inference attacks. The purifier can be specialized to a particular attack via adversarial learning, and the evaluation shows that a purifier dedicated to one attack also defends against the other, which empirically links the two attack types.
Findings
Reduces membership inference accuracy by up to 15%.
Increases model inversion error by up to 4 times.
Incurs less than 0.4% accuracy drop and 5.5% confidence score distortion.
Abstract
Neural networks are susceptible to data inference attacks such as the model inversion attack and the membership inference attack, where the attacker could infer the reconstruction and the membership of a data sample from the confidence scores predicted by the target classifier. In this paper, we propose a unified approach, namely purification framework, to defend against data inference attacks. It purifies the confidence score vectors predicted by the target classifier by reducing their dispersion. The purifier can be further specialized in defending against a particular attack via adversarial learning. We evaluate our approach on benchmark datasets and classifiers. We show that when the purifier is dedicated to one attack, it naturally defends against the other one, which empirically demonstrates the connection between the two attacks. The purifier can effectively defend against both attacks. For example, it can…
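To make "reducing dispersion" concrete, the following is a minimal toy sketch, not the paper's purifier. The abstract describes a learned purifier that can be specialized through adversarial learning; here a hypothetical stand-in, toy_purify, simply pulls each confidence vector toward the centroid of its predicted class. The names toy_purify and dispersion, the mixing weight alpha, and the synthetic scores are all assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Turn raw logits into a confidence score vector that sums to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(vector):
    """Index of the largest confidence score (the predicted label)."""
    return max(range(len(vector)), key=vector.__getitem__)


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def dispersion(vectors):
    """Mean squared Euclidean distance of the vectors to their centroid."""
    c = centroid(vectors)
    return sum(sum((a - b) ** 2 for a, b in zip(v, c)) for v in vectors) / len(vectors)


def group_by_label(vectors):
    """Group confidence vectors by their predicted label."""
    groups = {}
    for v in vectors:
        groups.setdefault(argmax(v), []).append(v)
    return groups


def toy_purify(vector, centroids, alpha=0.8):
    """Hypothetical stand-in for a purifier (assumption, not the paper's model):
    move the vector a fraction alpha of the way toward its class centroid."""
    c = centroids.get(argmax(vector))
    if c is None:  # no reference vectors for this label: leave it unchanged
        return list(vector)
    return [(1 - alpha) * a + alpha * b for a, b in zip(vector, c)]


# Synthetic confidence scores for a 3-class classifier (illustrative data only).
rng = random.Random(0)
scores = [softmax([rng.gauss(0, 2) for _ in range(3)]) for _ in range(300)]

groups = group_by_label(scores)
centroids = {label: centroid(vs) for label, vs in groups.items()}
purified = [toy_purify(v, centroids) for v in scores]

# Report within-class dispersion before and after the toy purification.
for label, vs in sorted(groups.items()):
    after = [toy_purify(v, centroids) for v in vs]
    print(label, round(dispersion(vs), 4), round(dispersion(after), 4))
```

In this toy setup the predicted label is preserved, because a convex combination of two vectors that share the same largest component keeps that component largest, while the within-class spread shrinks. That mirrors the goal stated in the abstract (less information in the scores, little change in accuracy) without claiming to reproduce the reported numbers.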
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data · Autopsy Techniques and Outcomes
