Universal Backdoor Attacks Detection via Adaptive Adversarial Probe

Yuhang Wang; Huafeng Shi; Rui Min; Ruijia Wu; Siyuan Liang; Yichao Wu; Ding Liang; Aishan Liu

arXiv:2209.05244 · cs.CV · December 8, 2022

Open Access

TL;DR

This paper introduces a universal backdoor attack detection method called Adaptive Adversarial Probe (A2P) that adaptively probes images to identify diverse backdoor triggers, outperforming existing methods across multiple datasets.

Contribution

The paper proposes a novel global-to-local probing framework with adaptive regions and attack budgets to detect diverse backdoor triggers in neural networks.

Findings

01. A2P outperforms state-of-the-art methods by large margins (+12%) on multiple datasets.

02. The adaptive probing strategy effectively detects diverse backdoor triggers.

03. Extensive experiments validate the robustness and effectiveness of A2P.

Abstract

Extensive evidence has demonstrated that deep neural networks (DNNs) are vulnerable to backdoor attacks, which motivates the development of backdoor attack detection. Most detection methods are designed to verify whether a model is infected with presumed types of backdoor attacks, yet in practice the adversary is likely to generate diverse backdoor attacks that are unforeseen by defenders, which challenges current detection strategies. In this paper, we focus on this more challenging scenario and propose a universal backdoor attack detection method named Adaptive Adversarial Probe (A2P). Specifically, we posit that the challenge of universal backdoor attack detection lies in the fact that different backdoor attacks often exhibit diverse characteristics in trigger patterns (i.e., sizes and transparencies). Therefore, our A2P adopts a global-to-local probing framework, which…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning