Hard to Forget: Poisoning Attacks on Certified Machine Unlearning
Neil G. Marchant, Benjamin I. P. Rubinstein, Scott Alfeld

TL;DR
This paper reveals a new poisoning attack on certified machine unlearning, where malicious training data can force costly retraining, exposing vulnerabilities in privacy-preserving data removal methods.
Contribution
It introduces a novel poisoning attack that exploits the unlearning process, demonstrating how attackers can increase computational costs for data removal.
Findings
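Poisoned training data can force full retraining from scratch, removing the computational savings that unlearning is meant to provide.
Current formal guarantees cover the indistinguishability of unlearned and retrained models but place no practical bounds on computation, so they do not prevent such attacks.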
Empirical results confirm attack effectiveness.
Abstract
The right to erasure requires removal of a user's information from data held by organizations, with rigorous interpretations extending to downstream products such as learned models. Retraining from scratch with the particular user's data omitted fully removes its influence on the resulting model, but comes with a high computational cost. Machine "unlearning" mitigates the cost incurred by full retraining: instead, models are updated incrementally, possibly only requiring retraining when approximation errors accumulate. Rapid progress has been made towards privacy guarantees on the indistinguishability of unlearned and retrained models, but current formalisms do not place practical bounds on computation. In this paper we demonstrate how an attacker can exploit this oversight, highlighting a novel attack surface introduced by machine unlearning. We consider an attacker aiming to increase…
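The retraining trigger described in the abstract can be illustrated with a toy simulation. This is a minimal sketch, not the paper's method: it assumes each erasure request adds some approximation error to a running total, and that a full retrain happens whenever the total would exceed a fixed tolerance. The class `UnlearningSimulator`, the `error_budget` value, and the per-removal error figures are all hypothetical and chosen only to show the effect.

```python
from dataclasses import dataclass


@dataclass
class UnlearningSimulator:
    """Toy model of budget-triggered retraining (illustrative only)."""

    error_budget: float              # hypothetical tolerance before retraining
    accumulated_error: float = 0.0   # error built up by approximate updates
    retrain_count: int = 0           # number of full retrains so far

    def remove(self, error_contribution: float) -> bool:
        """Process one erasure request; return True if it forced a retrain."""
        if self.accumulated_error + error_contribution > self.error_budget:
            # The approximate update would exceed the tolerance, so the
            # model is retrained from scratch and the error resets.
            self.retrain_count += 1
            self.accumulated_error = 0.0
            return True
        # Otherwise apply the cheap incremental update and track its error.
        self.accumulated_error += error_contribution
        return False


def count_retrains(contributions, error_budget=1.0):
    """Run a sequence of erasure requests and count forced retrains."""
    sim = UnlearningSimulator(error_budget=error_budget)
    for contribution in contributions:
        sim.remove(contribution)
    return sim.retrain_count


# Hypothetical per-removal errors: ordinary points vs. crafted points.
benign = [0.04] * 20
poisoned = [0.6] * 20

print("retrains (benign):  ", count_retrains(benign))
print("retrains (poisoned):", count_retrains(poisoned))
```

Under these assumed numbers, the benign requests never exhaust the budget, while the crafted points force a retrain on every second request. The sketch only shows why an unbounded retraining trigger is an attack surface; how an attacker actually crafts such points is the subject of the paper.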
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Adversarial Robustness in Machine Learning
