Post-Training Detection of Backdoor Attacks for Two-Class and Multi-Attack Scenarios

Zhen Xiang; David J. Miller; George Kesidis

arXiv:2201.08474 · cs.CR · March 15, 2022 · 6 cites

TL;DR

This paper introduces a novel backdoor attack detection framework that effectively identifies attacks in two-class and multi-attack scenarios without access to training data or clean classifiers, using reverse-engineering and an expected transferability statistic.

Contribution

The paper proposes a new detection framework based on backdoor pattern reverse-engineering and a novel expected transferability statistic, applicable to two-class and multi-attack scenarios without training data.

Findings

01 · Effective detection across six benchmark datasets

02 · Applicable to multi-class and multi-attack scenarios

03 · Threshold-independent detection performance

Abstract

Backdoor attacks (BAs) are an emerging threat to deep neural network classifiers. A victim classifier will predict to an attacker-desired target class whenever a test sample is embedded with the same backdoor pattern (BP) that was used to poison the classifier's training set. Detecting whether a classifier is backdoor attacked is not easy in practice, especially when the defender is, e.g., a downstream user without access to the classifier's training set. This challenge is addressed here by a reverse-engineering defense (RED), which has been shown to yield state-of-the-art performance in several domains. However, existing REDs are not applicable when there are only two classes or when multiple attacks are present. These scenarios are first studied in the current paper, under the practical constraints that the defender neither has access to the classifier's training set nor…
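
The poisoning step described at the start of the abstract can be illustrated with a minimal sketch. It assumes a patch-style backdoor pattern pasted into the top-left corner of a grayscale image stored as nested lists; the abstract does not specify the pattern type, and the names `embed_patch` and `poison_dataset` are hypothetical, not taken from the authors' repository. It shows only the attack, not the detection method.

```python
from typing import List, Tuple

Image = List[List[float]]      # grayscale image, pixel values in [0, 1]
Sample = Tuple[Image, int]     # (image, class label)


def embed_patch(image: Image, patch: Image, top: int, left: int) -> Image:
    """Return a copy of `image` with `patch` pasted at (top, left)."""
    out = [row[:] for row in image]  # copy rows so the clean image is untouched
    for i, patch_row in enumerate(patch):
        for j, value in enumerate(patch_row):
            out[top + i][left + j] = value
    return out


def poison_dataset(samples: List[Sample], patch: Image,
                   target_class: int, num_poisoned: int) -> List[Sample]:
    """Append patched copies of non-target samples, relabeled to the target.

    Assumption: the attacker adds poisoned copies rather than replacing
    clean samples; a real attack may do either.
    """
    poisoned = list(samples)
    sources = [(x, y) for x, y in samples if y != target_class]
    for x, _ in sources[:num_poisoned]:
        poisoned.append((embed_patch(x, patch, 0, 0), target_class))
    return poisoned


# Toy two-class dataset of 4x4 images: one sample per class.
clean = [
    ([[0.0] * 4 for _ in range(4)], 0),
    ([[0.5] * 4 for _ in range(4)], 1),
]
patch = [[1.0, 1.0], [1.0, 1.0]]  # hypothetical 2x2 white-square pattern
poisoned = poison_dataset(clean, patch, target_class=1, num_poisoned=1)
```

A classifier trained on `poisoned` may learn to associate the patch with class 1, which is the behavior a post-training detector has to uncover without ever seeing this training set.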

Code & Models

Repositories

zhenxianglance/2classbadetection
PyTorch · Official

Videos

Post-Training Detection of Backdoor Attacks for Two-Class and Multi-Attack Scenarios · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection