Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency

Xiaogeng Liu; Minghui Li; Haoyu Wang; Shengshan Hu; Dengpan Ye; Hai Jin; Libing Wu; Chaowei Xiao

arXiv:2303.18191 · cs.CR · April 3, 2023 · 1 citation

TL;DR

This paper introduces TeCo, a test-time method for detecting backdoor trigger samples. It needs only the victim model's hard-label outputs and exploits the fact that a backdoored model's robustness is consistent across image corruptions for clean samples but inconsistent for trigger samples.

Contribution

TeCo requires no extra clean data and no knowledge of the trigger's appearance. It uses corruption robustness consistency to identify trigger samples during inference.

Findings

1. TeCo outperforms existing methods with higher AUROC scores.

2. It demonstrates robustness across various datasets, attacks, and models.

3. TeCo achieves five times greater stability than state-of-the-art defenses.

Abstract

Deep neural networks are proven to be vulnerable to backdoor attacks. Detecting the trigger samples during the inference stage, i.e., the test-time trigger sample detection, can prevent the backdoor from being triggered. However, existing detection methods often require the defenders to have high accessibility to victim models, extra clean data, or knowledge about the appearance of backdoor triggers, limiting their practicality. In this paper, we propose the test-time corruption robustness consistency evaluation (TeCo), a novel test-time trigger sample detection method that only needs the hard-label outputs of the victim models without any extra information. Our journey begins with the intriguing observation that the backdoor-infected models have similar performance across different image corruptions for the clean images, but perform discrepantly for the trigger samples. Based on this…
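The observation in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract is truncated before the scoring rule is stated, so the score used here (the standard deviation, across corruption types, of the first severity level at which the hard label flips) is an assumption, as are the three toy corruption functions, the severity scale of 1 to 5, and every function name. The only model access it needs is a hypothetical `predict` callable that returns a hard label.

```python
import numpy as np

# Toy corruptions (assumed for illustration); img is a float array in [0, 1].
def brightness(img, severity):
    """Shift pixel intensities upward by an amount that grows with severity."""
    return np.clip(img + 0.1 * severity, 0.0, 1.0)

def contrast(img, severity):
    """Pull pixels toward the image mean, reducing contrast."""
    mean = img.mean()
    return np.clip((img - mean) * (1.0 - 0.15 * severity) + mean, 0.0, 1.0)

def gaussian_noise(img, severity):
    """Add zero-mean Gaussian noise; a fixed seed keeps the sketch deterministic."""
    rng = np.random.default_rng(0)
    return np.clip(img + rng.normal(0.0, 0.04 * severity, img.shape), 0.0, 1.0)

def first_flip_severity(predict, img, corrupt, max_severity=5):
    """Return the lowest severity at which the hard label changes.

    Returns max_severity + 1 if the label never changes.
    """
    base = predict(img)
    for severity in range(1, max_severity + 1):
        if predict(corrupt(img, severity)) != base:
            return severity
    return max_severity + 1

def consistency_score(predict, img, corruptions, max_severity=5):
    """Spread of flip severities across corruption types (assumed scoring rule).

    A low value means the label is about equally robust to every corruption;
    a high value means robustness differs between corruptions.
    """
    flips = [first_flip_severity(predict, img, c, max_severity) for c in corruptions]
    return float(np.std(flips))
```

Under this assumption, an input would be flagged as a suspected trigger sample when its score exceeds a threshold. For example, `consistency_score(predict, img, [brightness, contrast, gaussian_noise])` returns one score per input, and how the threshold is chosen is left open here.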

Code & Models

Repositories

cgcl-codes/teco (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications