Label-only Model Inversion Attack: The Attack that Requires the Least Information
Dayong Ye, Tianqing Zhu, Shuai Zhou, Bo Liu, Wanlei Zhou

TL;DR
This paper introduces a novel model inversion attack that only requires output labels, making it highly applicable and effective even when minimal information is available from the target model.
Contribution
The paper presents the first label-only model inversion attack, which reconstructs data records using only output labels by exploiting the target model's error rate to derive confidence scores for reconstruction.
Findings
Reconstructed data records are highly recognizable.
The attack outperforms existing methods in low-information scenarios.
The attack remains effective with only label outputs, so it requires minimal information about the target model.
Abstract
In a model inversion attack, an adversary attempts to reconstruct the data records, used to train a target model, using only the model's output. In launching a contemporary model inversion attack, the strategies discussed are generally based on either predicted confidence score vectors, i.e., black-box attacks, or the parameters of a target model, i.e., white-box attacks. However, in the real world, model owners usually only give out the predicted labels; the confidence score vectors and model parameters are hidden as a defense mechanism to prevent such attacks. Unfortunately, we have found a model inversion method that can reconstruct the input data records based only on the output labels. We believe this is the attack that requires the least information to succeed and, therefore, has the best applicability. The key idea is to exploit the error rate of the target model to compute the…
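The label-only setting described in the abstract can be illustrated with a small sketch. It does not reproduce the paper's method, since the abstract is truncated before the details; it only shows what "label-only access" means and how an adversary could empirically measure a target model's error rate from label queries alone. The names `toy_model`, `make_label_only_api`, and `estimate_error_rate`, as well as the auxiliary samples, are hypothetical and chosen purely for illustration.

```python
def toy_model(x):
    # Hypothetical 2-class target model (an assumption for illustration).
    # Internally it produces a score vector, which the owner keeps hidden.
    s = x[0] + x[1]
    return [1.0, 0.0] if s < 0 else [0.0, 1.0]


def make_label_only_api(model):
    """Wrap a model so that callers receive only the predicted label."""
    def query(x):
        scores = model(x)
        # Return the index of the highest score; the scores themselves are discarded.
        return max(range(len(scores)), key=scores.__getitem__)
    return query


def estimate_error_rate(query, samples, labels):
    """Fraction of auxiliary samples on which the label-only API disagrees with the given label."""
    wrong = sum(1 for x, y in zip(samples, labels) if query(x) != y)
    return wrong / len(samples)


# Adversary's view: only `query` is available, not `toy_model` or its scores.
query = make_label_only_api(toy_model)

# Hypothetical auxiliary samples with labels the adversary already holds.
aux_samples = [(1, 1), (-1, -1), (2, -1), (-3, 1)]
aux_labels = [1, 0, 0, 0]

error_rate = estimate_error_rate(query, aux_samples, aux_labels)
```

Here the adversary never sees a confidence score vector or a model parameter; every quantity is derived from predicted labels, which is the access level the abstract assumes.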
Taxonomy
Topics: Adversarial Robustness in Machine Learning
