Trap-MID: Trapdoor-based Defense against Model Inversion Attacks
Zhen-Ting Liu, Shang-Tse Chen

TL;DR
Trap-MID introduces a trapdoor mechanism into neural networks to effectively mislead model inversion attacks, providing a novel defense that outperforms existing methods without extra data or high computational costs.
Contribution
The paper proposes a trapdoor-based defense against model inversion (MI) attacks, supported by theoretical analysis and empirical validation of its effectiveness.
Findings
Achieves state-of-the-art defense performance against MI attacks
Does not require extra data or significant computational overhead
Effectively misleads MI attacks by using trapdoor triggers
Abstract
Model Inversion (MI) attacks pose a significant threat to the privacy of Deep Neural Networks by recovering the training data distribution from well-trained models. While existing defenses often rely on regularization techniques to reduce information leakage, they remain vulnerable to recent attacks. In this paper, we propose the Trapdoor-based Model Inversion Defense (Trap-MID) to mislead MI attacks. A trapdoor is integrated into the model to predict a specific label when the input is injected with the corresponding trigger. Consequently, this trapdoor information serves as the "shortcut" for MI attacks, leading them to extract trapdoor triggers rather than private data. We provide theoretical insights into the impact of the trapdoor's effectiveness and naturalness on deceiving MI attacks. In addition, empirical experiments demonstrate the state-of-the-art defense performance of Trap-MID…
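The trapdoor idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the blending rule in `inject_trigger`, the weights `alpha` and `beta`, and the `predict_logits` callable are all assumptions made for illustration. The sketch shows one trigger tied to one target label, where the training objective adds a trapdoor term to the usual classification loss.

```python
import numpy as np

def inject_trigger(x, trigger, alpha=0.1):
    """Blend a trigger pattern into a batch of inputs.

    The linear blend and the alpha value are illustrative assumptions,
    not taken from the paper.
    """
    return np.clip((1.0 - alpha) * x + alpha * trigger, 0.0, 1.0)

def cross_entropy(logits, labels):
    """Mean cross-entropy over a batch, computed from raw logits."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def trapdoor_objective(predict_logits, x, y, trigger, target_label,
                       alpha=0.1, beta=0.5):
    """Clean-task loss plus a weighted loss tying the trigger to target_label.

    predict_logits is a hypothetical callable mapping a batch of inputs
    to a (batch, num_classes) array of logits.
    """
    clean_loss = cross_entropy(predict_logits(x), y)
    x_trig = inject_trigger(x, trigger, alpha)
    target = np.full(len(x), target_label)
    trap_loss = cross_entropy(predict_logits(x_trig), target)
    return clean_loss + beta * trap_loss

# Toy usage with a random linear model over flattened 8x8 inputs.
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 4))
linear_model = lambda batch: batch.reshape(len(batch), -1) @ W
x = rng.uniform(size=(5, 8, 8))
y = rng.integers(0, 4, size=5)
trigger = rng.uniform(size=(8, 8))
loss = trapdoor_objective(linear_model, x, y, trigger, target_label=2)
```

Minimizing an objective of this shape would push the model to classify clean inputs correctly while also mapping triggered inputs to the target label, which is the "shortcut" an MI attack is expected to find.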
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Security and Verification in Computing · Cryptography and Data Security
