Data Poisoning Won't Save You From Facial Recognition
Evani Radiya-Dixit, Sanghyun Hong, Nicholas Carlini, Florian Tramèr

TL;DR
This paper critically examines data poisoning as a defense against facial recognition and argues that it is fundamentally limited: users perturb their pictures once, before publishing them, yet those pictures must fool every future model, including models trained adaptively against the attack.
Contribution
The paper demonstrates that data poisoning tools such as Fawkes and LowKey are ineffective against adaptively trained models and against models that use later advances in computer vision, which undermines their use as reliable privacy defenses.
Findings
The protection offered by poisoned images can be nullified by future models.
Model trainers can train robust models that resist poisoning.
Poisoned images can be detected online.
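To make the second finding concrete, here is a minimal sketch of how a model trainer with access to a public poisoning tool might prepare data for a robust model: each labeled picture is paired with a poisoned copy that keeps the correct identity label. This is an illustration under stated assumptions, not the paper's implementation. `toy_poison` is a hypothetical stand-in (bounded random noise) for a real tool such as Fawkes or LowKey, and `build_adaptive_training_set` is a name introduced here for illustration only.

```python
import random


def toy_poison(image, epsilon=0.05, seed=0):
    """Hypothetical stand-in for a public poisoning tool.

    Adds bounded noise to each pixel and clips to [0, 1]. Real tools
    craft perturbations against a feature extractor; this does not.
    """
    rng = random.Random(seed)
    return [min(1.0, max(0.0, p + rng.uniform(-epsilon, epsilon))) for p in image]


def build_adaptive_training_set(images, labels, poison_fn):
    """Pair every clean image with a poisoned copy under the same label."""
    poisoned = [poison_fn(img) for img in images]
    return images + poisoned, labels + labels


# Toy data: four flattened 4x4 grayscale "pictures" of two identities.
rng = random.Random(42)
images = [[rng.random() for _ in range(16)] for _ in range(4)]
labels = ["alice", "alice", "bob", "bob"]

aug_images, aug_labels = build_adaptive_training_set(images, labels, toy_poison)
print(len(aug_images), aug_labels)
```

A model trained on such a set sees perturbed pictures with their true labels, which is the asymmetry the paper points to: the trainer can adapt after the perturbation is public, while the user cannot update pictures that have already been scraped.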
Abstract
Data poisoning has been proposed as a compelling defense against facial recognition models trained on Web-scraped pictures. Users can perturb images they post online, so that models will misclassify future (unperturbed) pictures. We demonstrate that this strategy provides a false sense of security, as it ignores an inherent asymmetry between the parties: users' pictures are perturbed once and for all before being published (at which point they are scraped) and must thereafter fool all future models -- including models trained adaptively against the users' past attacks, or models that use technologies discovered after the attack. We evaluate two systems for poisoning attacks against large-scale facial recognition, Fawkes (500'000+ downloads) and LowKey. We demonstrate how an "oblivious" model trainer can simply wait for future developments in computer vision to nullify the protection of…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Digital Media Forensic Detection
Methods: Fawkes
