Transferable Clean-Label Poisoning Attacks on Deep Neural Nets

Chen Zhu; W. Ronny Huang; Ali Shafahi; Hengduo Li; Gavin Taylor; Christoph Studer; Tom Goldstein

arXiv:1905.05897 · stat.ML · May 17, 2019 · 136 cites

TL;DR

This paper introduces a transferable clean-label poisoning attack on deep neural networks in which the poison images surround the target image in feature space. It reports transferable attack success rates of over 50% while poisoning only 1% of the training set.

Contribution

The authors propose a new "polytope attack" and show that using Dropout during poison creation enhances transferability. The attack succeeds without access to the victim network's outputs, architecture, or (in some cases) training data.

Findings

01 Achieves transferable attack success rates of over 50%.

02 Poisons only 1% of the training set.

03 Using Dropout during poison creation improves attack transferability.

Abstract

Clean-label poisoning attacks inject innocuous-looking (and "correctly" labeled) poison images into training data, causing a model to misclassify a targeted image after being trained on this data. We consider transferable poisoning attacks that succeed without access to the victim network's outputs, architecture, or (in some cases) training data. To achieve this, we propose a new "polytope attack" in which poison images are designed to surround the targeted image in feature space. We also demonstrate that using Dropout during poison creation helps to enhance transferability of this attack. We achieve transferable attack success rates of over 50% while poisoning only 1% of the training set.
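
The abstract does not spell out the optimization behind the attack, so the following is only a minimal sketch of the geometric idea of "surrounding" a target in feature space. Assuming feature vectors have already been extracted, it measures how far a target feature lies from the convex hull of a few poison features, using projected gradient descent over convex-combination coefficients. The function names `project_to_simplex` and `hull_distance`, the toy 2-D features, and the step-size rule are illustrative assumptions and are not taken from the paper or its repository.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {c : c_j >= 0, sum(c) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            theta = t  # keep the threshold of the largest valid k
    return [max(x - theta, 0.0) for x in v]


def mix_features(coeffs, feats):
    """Weighted sum of feature vectors: sum_j coeffs[j] * feats[j]."""
    dim = len(feats[0])
    return [sum(c * f[i] for c, f in zip(coeffs, feats)) for i in range(dim)]


def hull_distance(poison_feats, target_feat, steps=500):
    """Return (coeffs, squared distance) for the point in the convex hull
    of poison_feats that is closest to target_feat.

    Hypothetical helper: minimizes 0.5 * ||sum_j c_j p_j - t||^2 over the
    simplex by projected gradient descent.
    """
    k = len(poison_feats)
    coeffs = [1.0 / k] * k
    # The squared Frobenius norm upper-bounds the gradient's Lipschitz
    # constant, so 1 / that value is a safe (assumed) step size.
    lipschitz = sum(x * x for f in poison_feats for x in f) or 1.0
    step = 1.0 / lipschitz
    for _ in range(steps):
        mixed = mix_features(coeffs, poison_feats)
        residual = [m - t for m, t in zip(mixed, target_feat)]
        grad = [sum(r * x for r, x in zip(residual, f)) for f in poison_feats]
        coeffs = project_to_simplex([c - step * g for c, g in zip(coeffs, grad)])
    mixed = mix_features(coeffs, poison_feats)
    dist = sum((m - t) ** 2 for m, t in zip(mixed, target_feat))
    return coeffs, dist


# Toy 2-D "features": three poisons forming a triangle.
poisons = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
_, inside_dist = hull_distance(poisons, [0.5, 0.5])   # target inside the triangle
_, outside_dist = hull_distance(poisons, [2.0, 2.0])  # target outside the triangle
print(round(inside_dist, 6), round(outside_dist, 6))
```

A squared distance near zero means the target can be written as a convex combination of the poison features, that is, it sits inside the polytope they span. A real attack would also have to optimize the poison images themselves so that they stay innocuous-looking, which this sketch omits.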

Code & Models

Repositories

zhuchen03/ConvexPolytopePosioning
PyTorch · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning

Methods: Dropout