Beating Backdoor Attack at Its Own Game
Min Liu, Alberto Sangiovanni-Vincentelli, Xiangyu Yue

TL;DR
This paper introduces a novel defense framework against backdoor attacks on deep neural networks, using non-adversarial backdoors to suppress malicious triggers while maintaining high accuracy on clean data.
Contribution
The paper proposes a simple, effective defense method that injects non-adversarial backdoors during data preprocessing, achieving state-of-the-art results without modifying training pipelines.
Findings
Achieves the lowest accuracy drop on clean data among the compared defenses
Demonstrates effectiveness across multiple benchmarks and architectures
Outperforms existing defense methods in attack success rate reduction
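The two quantities these findings refer to, clean accuracy and attack success rate (ASR), can be sketched as follows. This is a minimal illustration using the conventional definitions, not the paper's evaluation code. The function names and the choice to exclude samples whose true class is already the target are assumptions made for this example.

```python
def clean_accuracy(predictions, labels):
    """Fraction of clean (trigger-free) samples classified correctly."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def attack_success_rate(triggered_predictions, labels, target_class):
    """Fraction of triggered samples pushed to the attacker's target class.

    Assumption: samples whose true label already equals the target class
    are excluded, since predicting the target for them is not an attack win.
    """
    pairs = [(p, y) for p, y in zip(triggered_predictions, labels) if y != target_class]
    if not pairs:
        raise ValueError("no samples outside the target class")
    hits = sum(p == target_class for p, _ in pairs)
    return hits / len(pairs)


# Toy example with hypothetical predictions (not results from the paper).
labels = [0, 1, 2, 1]
acc = clean_accuracy([0, 1, 2, 0], labels)
asr = attack_success_rate([0, 0, 0, 1], labels, target_class=0)
```

A good defense keeps `acc` close to that of a model trained on clean data while driving `asr` toward zero.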
Abstract
Deep neural networks (DNNs) are vulnerable to backdoor attack, which does not affect the network's performance on clean data but would manipulate the network behavior once a trigger pattern is added. Existing defense methods have greatly reduced attack success rate, but their prediction accuracy on clean data still lags behind a clean model by a large margin. Inspired by the stealthiness and effectiveness of backdoor attack, we propose a simple but highly effective defense framework which injects non-adversarial backdoors targeting poisoned samples. Following the general steps in backdoor attack, we detect a small set of suspected samples and then apply a poisoning strategy to them. The non-adversarial backdoor, once triggered, suppresses the attacker's backdoor on poisoned data, but has limited influence on clean data. The defense can be carried out during data preprocessing, without…
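The preprocessing step described in the abstract (select a small set of suspected samples, then poison them with a defender-controlled backdoor) might look roughly like the sketch below. It is illustrative only and makes several assumptions the summary does not confirm: the detector is taken to supply a per-sample `suspicion` score, the poisoning strategy is assumed to be a corner stamp plus relabeling to a supplied `pseudo_labels` array, and all names and default values (`frac`, `stamp_size`, `stamp_value`) are hypothetical.

```python
import numpy as np


def stamp(images, stamp_size=2, stamp_value=1.0):
    """Return a copy of images (N, H, W, C) with a square patch in the top-left corner.

    Assumption: the defender's trigger is a simple constant patch.
    """
    out = np.array(images, dtype=np.float32, copy=True)
    out[:, :stamp_size, :stamp_size, :] = stamp_value
    return out


def inject_non_adversarial_backdoor(images, labels, suspicion, pseudo_labels,
                                    frac=0.05, stamp_size=2, stamp_value=1.0):
    """Stamp and relabel the most suspicious fraction of the training set.

    suspicion:     higher score means more likely poisoned (hypothetical detector output).
    pseudo_labels: defender's guess at each sample's true class (hypothetical).
    Returns new images, new labels, and the indices that were modified.
    """
    images = np.array(images, dtype=np.float32, copy=True)
    labels = np.array(labels, copy=True)
    n_select = max(1, int(round(frac * len(images))))
    suspected = np.argsort(suspicion)[-n_select:]  # indices with the highest scores
    images[suspected] = stamp(images[suspected], stamp_size, stamp_value)
    labels[suspected] = np.asarray(pseudo_labels)[suspected]
    return images, labels, suspected


# Toy data: 10 blank 4x4 single-channel images, all labeled 0.
imgs = np.zeros((10, 4, 4, 1), dtype=np.float32)
lbls = np.zeros(10, dtype=np.int64)
scores = np.arange(10, dtype=np.float32)   # samples 8 and 9 look most suspicious
pseudo = np.full(10, 3, dtype=np.int64)
new_imgs, new_lbls, picked = inject_non_adversarial_backdoor(
    imgs, lbls, scores, pseudo, frac=0.2)
```

Because this touches only the dataset, the training loop itself is unchanged. Under the same assumptions, `stamp` would also be applied to inputs at inference time so that the defensive backdoor is triggered.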
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Cardiac Arrest and Resuscitation
