INK: Inheritable Natural Backdoor Attack Against Model Distillation
Xiaolei Liu, Ming Yi, Kangyi Ding, Bangzhou Xin, Yixiao Xu, Li Yan, Chao Shen

TL;DR
INK is a backdoor attack that uses naturally occurring dataset features, specifically image variance, as its trigger. This keeps the attack stealthy and lets it retain a high success rate through model distillation, even when defenses are applied.
Contribution
The paper presents INK, a natural backdoor attack that exploits features inherent to the dataset so that the backdoor carries over through model distillation and bypasses defenses.
Findings
Achieves over 98% attack success rate post-distillation.
Maintains robustness against various defense strategies.
Outperforms existing methods, with 1.4% as the reported figure for the comparison.
Abstract
Deep learning models are vulnerable to backdoor attacks, where attackers inject malicious behavior through data poisoning and later exploit triggers to manipulate deployed models. To improve the stealth and effectiveness of backdoors, prior studies have introduced various imperceptible attack methods targeting both defense mechanisms and manual inspection. However, all poisoning-based attacks still rely on privileged access to the training dataset. Consequently, model distillation using a trusted dataset has emerged as an effective defense against these attacks. To bridge this gap, we introduce INK, an inheritable natural backdoor attack that targets model distillation. The key insight behind INK is the use of naturally occurring statistical features in all datasets, allowing attackers to leverage them as backdoor triggers without direct access to the training data. Specifically, INK…
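The summary names image variance as the natural statistic INK uses as a trigger, but the truncated abstract does not say how the variance is computed or applied. As a minimal sketch of the statistic only, assuming a grayscale image stored as a list of pixel rows, the population variance of the pixel intensities could be computed as follows. The names `image_variance` and `variance_exceeds`, and the idea of comparing against a threshold, are hypothetical illustrations and not the authors' method.

```python
from statistics import pvariance


def image_variance(image):
    """Population variance of all pixel intensities in a 2-D grayscale image.

    `image` is assumed to be a list of rows, each a list of numbers.
    """
    pixels = [float(p) for row in image for p in row]
    return pvariance(pixels)


def variance_exceeds(image, threshold):
    """Hypothetical check: is the image's pixel variance above `threshold`?"""
    return image_variance(image) > threshold


# A uniform image has zero variance; a checkerboard has high variance.
flat = [[128, 128], [128, 128]]
checker = [[0, 255], [255, 0]]

print(image_variance(flat))
print(image_variance(checker))
print(variance_exceeds(checker, 1000.0))
```

The point of the sketch is that variance is a property every image already has, so it can be measured without modifying or accessing the training data.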
Taxonomy
Topics: Radiation Effects in Electronics · Cryptographic Implementations and Security · Formal Methods in Verification
