Universal Backdoor Attacks Detection via Adaptive Adversarial Probe
Yuhang Wang, Huafeng Shi, Rui Min, Ruijia Wu, Siyuan Liang, Yichao Wu, Ding Liang, Aishan Liu

TL;DR
This paper introduces a universal backdoor attack detection method called Adaptive Adversarial Probe (A2P) that adaptively probes images to identify diverse backdoor triggers, outperforming existing methods across multiple datasets.
Contribution
The paper proposes a novel global-to-local probing framework with adaptive regions and attack budgets to detect diverse backdoor triggers in neural networks.
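As a rough illustration of what probing from global to local regions under a perturbation budget could look like, here is a minimal sketch. It is not the authors' A2P implementation: the function names (`square_regions`, `apply_probe`, `global_to_local_schedule`), the square-window layout, and the per-pixel clipping rule are all assumptions made for illustration. How A2P actually chooses its regions and budgets adaptively, and how it searches for the perturbation itself, is not described in the text shown here.

```python
# Illustrative sketch only: not the authors' A2P implementation.
# All function names and the square-window layout are hypothetical.

def square_regions(height, width, side):
    """Yield (top, left, bottom, right) for non-overlapping windows of a given side."""
    for top in range(0, height, side):
        for left in range(0, width, side):
            yield (top, left, min(top + side, height), min(left + side, width))


def global_to_local_schedule(height, width, sides):
    """List regions from the largest window (global) down to the smallest (local)."""
    schedule = []
    for side in sorted(sides, reverse=True):
        schedule.extend(square_regions(height, width, side))
    return schedule


def apply_probe(image, perturbation, region, budget):
    """Add a perturbation inside `region`, clipped to +/- budget per pixel.

    `image` and `perturbation` are 2-D lists of floats; pixel values are
    assumed to lie in [0, 1]. The input image is not modified.
    """
    top, left, bottom, right = region
    probed = [row[:] for row in image]  # copy each row so the input stays intact
    for r in range(top, bottom):
        for c in range(left, right):
            # Enforce the budget on the perturbation, then keep the pixel valid.
            delta = max(-budget, min(budget, perturbation[r][c]))
            probed[r][c] = min(1.0, max(0.0, image[r][c] + delta))
    return probed
```

In a detector built along these lines, each region in the schedule would be probed in turn, and the model's response to the constrained perturbation would be compared across regions and budgets; that scoring step is deliberately left out because the source does not specify it.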
Findings
A2P outperforms state-of-the-art methods by large margins (+12%) on multiple datasets.
The adaptive probing strategy effectively detects diverse backdoor triggers.
Extensive experiments validate the robustness and effectiveness of A2P.
Abstract
Extensive evidence has demonstrated that deep neural networks (DNNs) are vulnerable to backdoor attacks, which motivates the development of backdoor attack detection. Most detection methods are designed to verify whether a model is infected with presumed types of backdoor attacks, yet in practice the adversary is likely to generate diverse backdoor attacks that defenders have not foreseen, which challenges current detection strategies. In this paper, we focus on this more challenging scenario and propose a universal backdoor attack detection method named Adaptive Adversarial Probe (A2P). Specifically, we posit that the challenge of universal backdoor attack detection lies in the fact that different backdoor attacks often exhibit diverse characteristics in trigger patterns (i.e., sizes and transparencies). Therefore, our A2P adopts a global-to-local probing framework, which…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning
