Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching

Jonas Geiping; Liam Fowl; W. Ronny Huang; Wojciech Czaja; Gavin Taylor; Michael Moeller; Tom Goldstein

arXiv:2009.02276 · cs.CV · May 11, 2021 · 36 cites

TL;DR

This paper introduces a scalable, targeted data poisoning attack using gradient matching that can cause misclassification in large, modern deep neural networks trained from scratch, highlighting a significant security threat.

Contribution

It presents the first effective large-scale poisoning method that works on full-sized datasets like ImageNet, demonstrating vulnerabilities in current defenses.

Findings

01. First successful targeted poisoning attack against ImageNet models trained from scratch

02. Attack remains nearly imperceptible and effective against modern models

03. Existing defenses are insufficient against this threat

Abstract

Data Poisoning attacks modify training data to maliciously control a model trained on such data. In this work, we focus on targeted poisoning attacks which cause a reclassification of an unmodified test image and as such breach model integrity. We consider a particularly malicious poisoning attack that is both "from scratch" and "clean label", meaning we analyze an attack that successfully works against new, randomly initialized models, and is nearly imperceptible to humans, all while perturbing only a small fraction of the training data. Previous poisoning attacks against deep neural networks in this setting have been limited in scope and success, working only in simplified settings or being prohibitively expensive for large datasets. The central mechanism of the new attack is matching the gradient direction of malicious examples. We analyze why this works, supplement with practical…
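The abstract names gradient-direction matching as the central mechanism, but the excerpt above does not spell out the objective. The following is a minimal sketch, assuming alignment is measured as one minus the cosine similarity between the gradient of the target example under the attacker's chosen label and the summed gradient of the poisoned examples. The toy logistic model and the names `logistic_grad`, `gradient_matching_loss`, `aligned`, and `opposed` are hypothetical and stand in for a deep network and its training loss; this is an illustration, not the authors' implementation.

```python
import math

def logistic_grad(w, x, y):
    """Gradient of the logistic loss w.r.t. weights w for one example (x, y), y in {0, 1}."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    p = 1.0 / (1.0 + math.exp(-z))
    return [(p - y) * xi for xi in x]

def gradient_matching_loss(target_grad, poison_grads):
    """Assumed objective: 1 - cosine similarity between the target gradient
    and the sum of the poison gradients (0 = same direction, 2 = opposite)."""
    summed = [sum(col) for col in zip(*poison_grads)]
    dot = sum(a * b for a, b in zip(target_grad, summed))
    norm_t = math.sqrt(sum(a * a for a in target_grad))
    norm_p = math.sqrt(sum(b * b for b in summed))
    return 1.0 - dot / (norm_t * norm_p)

# Hypothetical toy setup: fixed weights, one target example, attacker-chosen label.
w = [0.5, -0.25]
target_x, adversarial_label = [1.0, 2.0], 1
target_grad = logistic_grad(w, target_x, adversarial_label)

# A poison whose training gradient points the same way as the target gradient...
aligned = [logistic_grad(w, [2.0, 4.0], 1)]
# ...and one whose gradient points the opposite way.
opposed = [logistic_grad(w, [1.0, 2.0], 0)]

loss_aligned = gradient_matching_loss(target_grad, aligned)
loss_opposed = gradient_matching_loss(target_grad, opposed)
print(loss_aligned, loss_opposed)
```

In this toy case the aligned poison drives the loss to roughly 0 and the opposed one to roughly 2. Under the stated assumption, an attacker would adjust small perturbations on the poisoned inputs to push this loss down, so that ordinary training on the poisons moves the model in the direction that misclassifies the target.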

Code & Models

Videos

Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications