TL;DR
This paper introduces a novel GAN-based approach for model inversion attacks that leverages knowledge distillation and class distribution modeling, significantly improving attack success rates on private models.
Contribution
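The paper presents an inversion-specific GAN whose discriminator is trained with soft labels from the target model. Instead of searching for a single representative data point, it models a private data distribution for each target class. Together these make MI attacks on deep neural networks more effective.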
Findings
Boosts attack success rate by 150% over previous methods
Generalizes well across various datasets and models
Trains the discriminator to distinguish real from fake samples and also to predict the soft labels that the target model provides
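The discriminator objective in the last finding can be sketched as a sum of two terms: a real-versus-fake term and a soft-label term. The following is a minimal illustration in plain Python, not the authors' implementation. It assumes a semi-supervised-GAN-style discriminator that outputs one logit per class and derives its "real" score from the log-sum-exp of those logits. The function names, the `weight` parameter, and the example numbers are all hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_label_loss(disc_logits, target_probs):
    """Cross-entropy between the target model's soft label and the
    discriminator's class prediction for a public (real) sample."""
    pred = softmax(disc_logits)
    return -sum(t * math.log(p + 1e-12) for t, p in zip(target_probs, pred))

def real_fake_loss(disc_logits, is_real):
    """Real/fake term. Assumption: p(real) = Z / (Z + 1), where Z is the
    sum of exp(logits), which equals sigmoid(log-sum-exp of the logits)."""
    m = max(disc_logits)
    lse = m + math.log(sum(math.exp(x - m) for x in disc_logits))
    p_real = 1.0 / (1.0 + math.exp(-lse))
    p = p_real if is_real else 1.0 - p_real
    return -math.log(p + 1e-12)

def discriminator_loss(real_logits, target_probs, fake_logits, weight=1.0):
    """Real/fake loss on one real and one generated sample, plus the
    weighted soft-label loss on the real sample (weight is hypothetical)."""
    unsupervised = real_fake_loss(real_logits, True) + real_fake_loss(fake_logits, False)
    supervised = soft_label_loss(real_logits, target_probs)
    return unsupervised + weight * supervised

# Illustrative numbers only: a 3-class target model and made-up logits.
example_loss = discriminator_loss(
    real_logits=[2.0, 0.5, -1.0],
    target_probs=[0.7, 0.2, 0.1],
    fake_logits=[-1.5, -2.0, -1.0],
)
```

In this sketch the soft-label term is what carries information about the target model into the discriminator. The real/fake term alone would give an ordinary GAN trained on public data.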
Abstract
Model inversion (MI) attacks aim to reconstruct training data from model parameters. Such attacks have raised growing privacy concerns, especially given the increasing number of online model repositories. However, existing MI attacks against deep neural networks (DNNs) leave considerable room for improvement. We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data. In particular, we train the discriminator to differentiate not only the real and fake samples but also the soft labels provided by the target model. Moreover, unlike previous work that directly searches for a single data point to represent a target class, we propose to model a private data distribution for each target class. Our experiments show that the combination of these techniques can significantly boost the success rate…
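The idea of modeling a distribution per target class, rather than optimizing one point, can be illustrated with a toy example. The sketch below assumes a diagonal Gaussian over the generator's latent space, fitted with the reparameterization trick (z = mu + sigma * eps). The abstract does not specify the distribution family or the optimizer, so both are assumptions here. `toy_identity_loss_grad` is a hypothetical stand-in for the gradient that would come from the generator and target model in a real attack.

```python
import math
import random

def sample_latent(mu, log_sigma, rng):
    """Reparameterized draw: z = mu + exp(log_sigma) * eps, eps ~ N(0, I)."""
    eps = [rng.gauss(0.0, 1.0) for _ in mu]
    z = [m + math.exp(s) * e for m, s, e in zip(mu, log_sigma, eps)]
    return z, eps

def toy_identity_loss_grad(z, target):
    """Hypothetical stand-in: gradient of ||z - target||^2 with respect to z.
    A real attack would backpropagate through the generator and target model."""
    return [2.0 * (zi - ti) for zi, ti in zip(z, target)]

def fit_class_distribution(target, steps=200, batch=16, lr=0.05, seed=0):
    """Fit (mu, log_sigma) for one target class by minibatch gradient descent."""
    rng = random.Random(seed)
    dim = len(target)
    mu = [0.0] * dim
    log_sigma = [0.0] * dim
    for _ in range(steps):
        g_mu = [0.0] * dim
        g_ls = [0.0] * dim
        for _ in range(batch):
            z, eps = sample_latent(mu, log_sigma, rng)
            g_z = toy_identity_loss_grad(z, target)
            for i in range(dim):
                # Chain rule through z = mu + sigma * eps.
                g_mu[i] += g_z[i] / batch
                g_ls[i] += g_z[i] * math.exp(log_sigma[i]) * eps[i] / batch
        mu = [m - lr * g for m, g in zip(mu, g_mu)]
        log_sigma = [s - lr * g for s, g in zip(log_sigma, g_ls)]
    return mu, log_sigma

# Toy run: the "private" optimum for this class sits at (1.0, -2.0).
mu, log_sigma = fit_class_distribution([1.0, -2.0])
```

Because the toy loss has a single minimum, the fitted standard deviation shrinks toward zero. With a real target model, many latent points can score well for a class, and the learned distribution can then be sampled repeatedly to produce varied reconstructions.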
