Label-Only Model Inversion Attacks via Knowledge Transfer
Ngoc-Bao Nguyen, Keshigeyan Chandrasegaran, Milad Abdollahzadeh, Ngai-Man Cheung

TL;DR
This paper introduces LOKT, a novel method for label-only model inversion attacks that leverages knowledge transfer to surrogate models, significantly improving attack success in minimal-information scenarios.
Contribution
The paper proposes a new knowledge transfer approach using generative models and introduces T-ACGAN for effective label-only MI attacks, transforming the problem into a white-box setting.
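The general idea of transferring knowledge through hard labels alone can be illustrated with a toy sketch. This is not the paper's T-ACGAN or its implementation: the linear target model, the Gaussian query samples (standing in for generator output), the perceptron surrogate, and all function names below are assumptions made purely for illustration.

```python
import random

random.seed(0)
DIM, CLASSES = 4, 3

# Hypothetical opaque target: the attacker never reads these weights directly.
_TARGET_W = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(CLASSES)]

def scores(w, x):
    """Per-class linear scores for one sample."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def argmax(values):
    return max(range(len(values)), key=values.__getitem__)

def query_hard_label(x):
    """Label-only access: returns a class index, never confidence scores."""
    return argmax(scores(_TARGET_W, x))

def train_surrogate(samples, labels, epochs=20):
    """Multiclass perceptron fitted only on (sample, hard label) pairs."""
    w = [[0.0] * DIM for _ in range(CLASSES)]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = argmax(scores(w, x))
            if pred != y:  # move the true class up and the wrong class down
                for i in range(DIM):
                    w[y][i] += x[i]
                    w[pred][i] -= x[i]
    return w

def sample(n):
    # Stand-in for synthetic samples; the paper obtains them from a generative model.
    return [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(n)]

x_query = sample(500)
y_query = [query_hard_label(x) for x in x_query]  # one label-only query each
w_surrogate = train_surrogate(x_query, y_query)

x_test = sample(500)
agreement = sum(
    argmax(scores(w_surrogate, x)) == query_hard_label(x) for x in x_test
) / len(x_test)
print(f"surrogate/target label agreement: {agreement:.2f}")
```

Once the surrogate agrees with the target on most inputs, the attacker holds its full parameters, so an inversion attack that needs white-box access can be run against the surrogate instead of the opaque target.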
Findings
Outperforms state-of-the-art label-only MI attacks by over 15%
Effectively uses surrogate models for privacy breach in minimal information settings
Reduces query budget compared to existing methods
Abstract
In a model inversion (MI) attack, an adversary abuses access to a machine learning (ML) model to infer and reconstruct private training data. Remarkable progress has been made in the white-box and black-box setups, where the adversary has access to the complete model or the model's soft output respectively. However, there is very limited study in the most challenging but practically important setup: Label-only MI attacks, where the adversary only has access to the model's predicted label (hard label) without confidence scores nor any other model information. In this work, we propose LOKT, a novel approach for label-only MI attacks. Our idea is based on transfer of knowledge from the opaque target model to surrogate models. Subsequently, using these surrogate models, our approach can harness advanced white-box attacks. We propose knowledge transfer based on generative modelling, and…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data · Topic Modeling
