Circumventing Backdoor Defenses That Are Based on Latent Separability

Xiangyu Qi; Tinghao Xie; Yiming Li; Saeed Mahloujifar; Prateek Mittal

arXiv:2205.13613 · cs.LG · March 7, 2023 · 5 citations

TL;DR

This paper demonstrates that adaptive backdoor poisoning attacks can bypass defenses based on latent separability by diversifying latent representations, challenging the assumption that latent separation is unavoidable in such attacks.

Contribution

The authors develop adaptive attack methods that counter the latent separability assumption, revealing its limitations for backdoor defense strategies.

Findings

01. Adaptive attacks bypass existing latent separation defenses

02. High attack success rate with minimal impact on clean accuracy

03. Latent separation is not an unavoidable feature of backdoor attacks

Abstract

Recent studies revealed that deep learning is susceptible to backdoor poisoning attacks. An adversary can embed a hidden backdoor into a model to manipulate its predictions by modifying only a few training samples, without controlling the training process. A tangible signature has been widely observed across a diverse set of backdoor poisoning attacks: models trained on a poisoned dataset tend to learn separable latent representations for poison and clean samples. This latent separation is so pervasive that a family of backdoor defenses takes it as a default assumption (dubbed the latent separability assumption) and relies on it to identify poison samples via cluster analysis in the latent space. An intriguing question follows: is latent separation unavoidable for backdoor poisoning attacks? This question is central to understanding whether the assumption…
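The cluster-analysis idea mentioned in the abstract can be illustrated with a toy sketch. The code below is not the paper's method or any specific published defense. It assumes 2-D synthetic "latent" vectors, a plain 2-means clustering, and a rule that flags the smaller cluster as suspected poison. All names (`two_means`, `flag_minority_cluster`, `blob`) and the data are hypothetical. It contrasts a case where poison latents form their own cluster with a case where they overlap the clean latents, which is the situation an adaptive attack would aim for.

```python
import math
import random


def two_means(points, iters=20):
    """Plain 2-means (Lloyd's algorithm); returns a 0/1 label per point."""
    c0 = points[0]
    c1 = max(points, key=lambda p: math.dist(p, c0))  # farthest-point init
    centroids = [c0, c1]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        labels = [
            0 if math.dist(p, centroids[0]) <= math.dist(p, centroids[1]) else 1
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster empties out
                centroids[k] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def flag_minority_cluster(latents):
    """Toy defense: cluster the latents and flag the smaller cluster."""
    labels = two_means(latents)
    minority = 0 if labels.count(0) < labels.count(1) else 1
    return {i for i, lab in enumerate(labels) if lab == minority}


def recall(flagged, poison_idx):
    """Fraction of true poison indices that were flagged."""
    return len(flagged & poison_idx) / len(poison_idx)


rng = random.Random(0)


def blob(center, n):
    """n synthetic 2-D latents drawn around `center` (unit variance)."""
    return [(rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)) for _ in range(n)]


clean = blob((0.0, 0.0), 90)
poison_idx = set(range(90, 100))  # the last 10 samples are the poison ones

# Case 1: poison latents sit far from the clean ones (latent separation holds).
separable = clean + blob((8.0, 8.0), 10)
# Case 2: poison latents are drawn from the clean distribution (no separation).
mixed = clean + blob((0.0, 0.0), 10)

recall_separable = recall(flag_minority_cluster(separable), poison_idx)
recall_mixed = recall(flag_minority_cluster(mixed), poison_idx)
print("recall with separation:   ", recall_separable)
print("recall without separation:", recall_mixed)
```

In the first case the minority cluster coincides with the poison samples. In the second case the clustering has no poison-specific structure to latch onto, so whatever it flags is driven by the clean data's own geometry.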

Code & Models

Repositories

unispac/circumventing-backdoor-defenses (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Malware Detection Techniques