Handcrafted Backdoors in Deep Neural Networks
Sanghyun Hong, Nicholas Carlini, Alexey Kurakin

TL;DR
This paper introduces a handcrafted backdoor attack on deep neural networks that directly manipulates model weights, achieving attack success rates above 96% and evading many existing defenses across four datasets and four network architectures.
Contribution
The paper presents an attack technique that goes beyond poisoning by directly modifying model weights, showing that a supply-chain attacker has more techniques available than previously considered.
Findings
Maintains an attack success rate above 96% across four datasets and four network architectures
Effectively evades many existing backdoor detection and removal defenses
Highlights the need for further research into supply-chain backdoor vulnerabilities
Abstract
When machine learning training is outsourced to third parties, backdoor attacks become practical as the third party who trains the model may act maliciously to inject hidden behaviors into the otherwise accurate model. Until now, the mechanism to inject backdoors has been limited to poisoning. We argue that a supply-chain attacker has more attack techniques available by introducing a handcrafted attack that directly manipulates a model's weights. This direct modification gives our attacker more degrees of freedom compared to poisoning, and we show it can be used to evade many backdoor detection or removal defenses effectively. Across four datasets and four network architectures our backdoor attacks maintain an attack success rate above 96%. Our results suggest that further research is needed for understanding the complete space of supply-chain backdoor attacks.
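To make the idea of editing weights directly (rather than poisoning training data) concrete, here is a minimal toy sketch. It is not the authors' procedure: it assumes a hypothetical two-class linear model with one input feature that clean data never uses, and every name in it (`W_clean`, `add_trigger`, `TRIGGER_FEATURE`, and so on) is invented for illustration. It also shows how an attack success rate can be computed as the fraction of triggered inputs classified as the target label.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "clean" model: a linear 2-class classifier over 4 features.
# Feature 3 carries zero weight and is always 0 in clean data (an assumption
# of this toy setup), so it is free to host a trigger.
W_clean = np.array([[ 1.0, -1.0],
                    [ 0.5, -0.5],
                    [-1.0,  1.0],
                    [ 0.0,  0.0]])

TRIGGER_FEATURE = 3
TARGET_LABEL = 1

def predict(W, x):
    """Return the argmax class for each row of x under weights W."""
    return np.argmax(x @ W, axis=1)

def add_trigger(x):
    """Stamp the trigger by setting the unused feature to 1."""
    x = x.copy()
    x[:, TRIGGER_FEATURE] = 1.0
    return x

# Clean inputs: three informative features in [-1, 1], trigger feature off.
x_clean = np.zeros((1000, 4))
x_clean[:, :3] = rng.uniform(-1.0, 1.0, size=(1000, 3))

# Handcraft the backdoor: edit one weight row, with no training involved.
# The margin of 10 per logit exceeds the largest clean logit gap (5),
# so the trigger always wins.
W_backdoor = W_clean.copy()
W_backdoor[TRIGGER_FEATURE] = [-10.0, 10.0]

# Behavior on clean inputs is unchanged because the trigger feature is 0.
clean_agreement = np.mean(predict(W_backdoor, x_clean) == predict(W_clean, x_clean))

# Attack success rate: share of triggered inputs sent to the target label.
attack_success_rate = np.mean(predict(W_backdoor, add_trigger(x_clean)) == TARGET_LABEL)

print(clean_agreement, attack_success_rate)
```

In this toy setting the edited model agrees with the original on every clean input while every triggered input maps to the target label. Real networks have no conveniently unused feature, which is part of why the paper's results on actual architectures and defenses are the relevant evidence, not this sketch.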
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection
