CEPA: Consensus Embedded Perturbation for Agnostic Detection and Inversion of Backdoors

Guangmingmei Yang; Xi Li; Hang Wang; David J. Miller; George Kesidis

arXiv:2402.02034 · cs.CR · March 10, 2025 · 1 citation

TL;DR

This paper introduces CEPA, a backdoor detector for neural network classifiers that uses embedded feature representations to invert the backdoor and identify its target class. It is effective across various backdoor incorporation mechanisms and can operate without access to the training dataset.

Contribution

CEPA is a novel, backdoor-agnostic detection approach that operates without training data and effectively handles multiple backdoor incorporation methods.

Findings

01

CEPA compares favorably with an array of published defenses for a variety of attacks on CIFAR-10 and CIFAR-100.

02

It effectively detects and inverts backdoors across different attack mechanisms.

03

CEPA does not require access to the training dataset.

Abstract

A variety of defenses have been proposed against Trojans planted in (backdoor attacks on) deep neural network (DNN) classifiers. Backdoor-agnostic methods seek to reliably detect and/or mitigate backdoors irrespective of the incorporation mechanism used by the attacker, while inversion methods explicitly assume one. In this paper, we describe a new detector that: relies on embedded feature representations to estimate (invert) the backdoor and to identify its target class; can operate without access to the training dataset; and is highly effective for various incorporation mechanisms (i.e., is backdoor agnostic). Our detection approach is evaluated (and found to be favorable) in comparison with an array of published defenses for a variety of different attacks on the CIFAR-10 and CIFAR-100 image-classification domains.

Taxonomy

Topics: Adversarial Robustness in Machine Learning