Boosting Model Inversion Attacks with Adversarial Examples
Shuai Zhou, Tianqing Zhu, Dayong Ye, Xin Yu, and Wanlei Zhou

TL;DR
This paper introduces a novel training paradigm for learning-based model inversion attacks that uses semantic loss and adversarial examples to significantly improve attack accuracy and data diversity in black-box settings.
Contribution
It proposes a new training scheme combining semantic regularization and adversarial example injection to enhance the effectiveness of learning-based model inversion attacks.
Findings
Boosts attack accuracy even without extra queries.
Reconstructs more diverse and class-representative data.
Highlights the increased threat level of learning-based attacks.
Abstract
Model inversion attacks reconstruct the training data of a target model, which raises serious privacy concerns for machine learning models. However, these attacks, especially learning-based methods, are likely to suffer from low attack accuracy, i.e., low classification accuracy of the reconstructed data by machine learning classifiers. Recent studies have shown that an alternative strategy for model inversion attacks, GAN-based optimization, can improve attack accuracy effectively. However, this series of GAN-based attacks reconstructs only class-representative training data for a class, whereas learning-based attacks can reconstruct diverse data for different training data in each class. Hence, in this paper, we propose a new training paradigm for a learning-based model inversion attack that can achieve higher attack accuracy in a black-box setting. First, we regularize the…
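The abstract defines attack accuracy as the classification accuracy of the reconstructed data under a machine learning classifier. A minimal sketch of that metric follows. It is an illustration only, not the authors' evaluation code: the function name `attack_accuracy`, the `evaluation_classifier` callable, and the toy threshold classifier are all hypothetical.

```python
from typing import Callable, Sequence


def attack_accuracy(
    reconstructions: Sequence[Sequence[float]],
    intended_labels: Sequence[int],
    evaluation_classifier: Callable[[Sequence[float]], int],
) -> float:
    """Fraction of reconstructed samples that the evaluation classifier
    assigns to the class the attacker intended to reconstruct.

    All names here are hypothetical; the paper's own evaluation setup
    is not described in this excerpt.
    """
    if len(reconstructions) != len(intended_labels):
        raise ValueError("reconstructions and intended_labels differ in length")
    if len(reconstructions) == 0:
        return 0.0
    hits = sum(
        1
        for sample, label in zip(reconstructions, intended_labels)
        if evaluation_classifier(sample) == label
    )
    return hits / len(reconstructions)


def toy_classifier(sample: Sequence[float]) -> int:
    # Stand-in for a trained evaluation model (assumption for illustration):
    # predicts class 1 when the feature sum exceeds 1.0, else class 0.
    return 1 if sum(sample) > 1.0 else 0


reconstructed = [[0.9, 0.8], [0.1, 0.2], [0.7, 0.6], [0.2, 0.1]]
intended = [1, 0, 0, 0]
accuracy = attack_accuracy(reconstructed, intended, toy_classifier)
print(accuracy)  # 0.75: three of the four reconstructions match their intended class
```

In practice the evaluation classifier would be a trained model rather than a threshold rule, but the metric is the same ratio of matching predictions to total reconstructions.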
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Forensic and Genetic Research
