Spectral Signatures in Backdoor Attacks

Brandon Tran; Jerry Li; Aleksander Madry

arXiv:1811.00636·cs.LG·November 5, 2018·237 cites

TL;DR

This paper identifies spectral signatures as a property shared by all known backdoor attacks and uses them to detect and remove poisoned training examples.

Contribution

It introduces the concept of spectral signatures in backdoor attacks and demonstrates their use in detecting and removing poisoned examples on real image sets and state-of-the-art neural network architectures.

Findings

01. Spectral signatures are present in all known backdoor attacks.

02. Tools from robust statistics can effectively detect poisoned data.

03. Detection and removal improve neural network security against backdoors.

Abstract

A recent line of work has uncovered a new form of data poisoning: so-called backdoor attacks. These attacks are particularly dangerous because they do not affect a network's behavior on typical, benign data. Rather, the network only deviates from its expected output when triggered by a perturbation planted by an adversary. In this paper, we identify a new property of all known backdoor attacks, which we call spectral signatures. This property allows us to utilize tools from robust statistics to thwart the attacks. We demonstrate the efficacy of these signatures in detecting and removing poisoned examples on real image sets and state-of-the-art neural network architectures. We believe that understanding spectral signatures is a crucial first step towards designing ML systems secure against such backdoor attacks.
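
The abstract does not spell out the detection procedure, so the following is only a minimal sketch of how a spectral filter in this spirit might look. It assumes that each training example has a feature vector (for instance, taken from a hidden layer of the trained network), that poisoned examples shift those vectors along a common direction, and that an outlier score can be computed as the squared projection of each centered vector onto the top singular direction. The function names, the synthetic data, and the removal fraction are illustrative assumptions, not taken from the paper or the listed repository.

```python
import numpy as np


def spectral_scores(reps):
    """Score each row by its squared projection onto the top singular direction.

    `reps` is an (n_examples, n_features) array of feature vectors
    (hypothetical input; how the features are obtained is an assumption).
    """
    centered = reps - reps.mean(axis=0, keepdims=True)
    # Rows of vt are right singular vectors, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return (centered @ vt[0]) ** 2


def filter_suspects(reps, remove_frac):
    """Return the indices kept after dropping the highest-scoring fraction."""
    scores = spectral_scores(reps)
    n_remove = int(np.ceil(remove_frac * len(reps)))
    order = np.argsort(scores)  # ascending: most suspicious rows come last
    keep = np.sort(order[: len(reps) - n_remove])
    return keep, scores


# Synthetic demo (assumed data): 450 clean rows and 50 "poisoned" rows
# whose features are shifted along one coordinate.
rng = np.random.default_rng(0)
clean = rng.normal(size=(450, 16))
shift = np.zeros(16)
shift[0] = 10.0
poisoned = rng.normal(size=(50, 16)) + shift
reps = np.vstack([clean, poisoned])

keep, scores = filter_suspects(reps, remove_frac=0.15)
poisoned_kept = int(np.sum(keep >= 450))
print("rows kept:", len(keep), "poisoned rows kept:", poisoned_kept)
```

In this toy setup the shifted rows dominate the top singular direction, so they receive the largest scores and are dropped first. In practice one would presumably apply such a filter per class label and retrain on the kept examples, but those details are assumptions here rather than something this page states.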

Code & Models

Repositories

bxz9200/ultraclean (PyTorch)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Cryptographic Implementations and Security