Attribute-to-Delete: Machine Unlearning via Datamodel Matching
Kristian Georgiev, Roy Rinberg, Sung Min Park, Shivam Garg, Andrew Ilyas, Aleksander Madry, Seth Neel

TL;DR
This paper introduces Datamodel Matching, a new machine unlearning method that leverages data attribution to produce models indistinguishable from retrained models, showing strong empirical results especially in non-convex settings.
Contribution
The paper proposes a meta-algorithm, Datamodel Matching (DMM), that reduces unlearning to data attribution, and it reports improved performance over existing unlearning methods.
Findings
DMM outperforms existing unlearning algorithms in convex settings.
DMM achieves strong unlearning performance in non-convex models.
Future data attribution advances can enhance unlearning techniques.
Abstract
Machine unlearning -- efficiently removing the effect of a small "forget set" of training data on a pre-trained machine learning model -- has recently attracted significant research interest. Despite this interest, however, recent work shows that existing machine unlearning techniques do not hold up to thorough evaluation in non-convex settings. In this work, we introduce a new machine unlearning technique that exhibits strong empirical performance even in such challenging settings. Our starting point is the perspective that the goal of unlearning is to produce a model whose outputs are statistically indistinguishable from those of a model re-trained on all but the forget set. This perspective naturally suggests a reduction from the unlearning problem to that of data attribution, where the goal is to predict the effect of changing the training set on a model's outputs. Thus motivated,…
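The reduction described above can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the paper's implementation: it uses ridge regression, a first-order influence-style estimate as a stand-in for the data attribution step, and plain gradient descent for the matching step. All names, the synthetic data, and the hyperparameters are hypothetical. It first predicts what a model retrained without the forget set would output, then fine-tunes the existing model toward those predicted outputs, and finally compares against an actual retrained model, which is computed for reference only.

```python
import numpy as np

# Toy data (hypothetical): 200 points, 5 features, first 10 form the forget set.
rng = np.random.default_rng(0)
n, d, lam = 200, 5, 1e-2
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)
forget = np.arange(10)
retain = np.arange(10, n)
y[forget] += 3.0  # assumption: shift forget labels so removing them matters


def fit_ridge(X, y, lam):
    """Closed-form ridge regression."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)


theta_full = fit_ridge(X, y, lam)                     # model to be unlearned
theta_retrain = fit_ridge(X[retain], y[retain], lam)  # reference only

# Step 1 (attribution stand-in): predict the parameters, and hence the
# outputs, of a model trained without the forget set, using a first-order
# influence-style approximation around theta_full.
H = X.T @ X + lam * np.eye(d)
resid_forget = X[forget] @ theta_full - y[forget]
theta_pred = theta_full + np.linalg.solve(H, X[forget].T @ resid_forget)
targets = X @ theta_pred  # predicted "retrained" outputs

# Step 2 (matching): fine-tune the existing model so that its outputs
# match the predicted outputs, via gradient descent on squared error.
theta_unlearned = theta_full.copy()
for _ in range(500):
    grad = X.T @ (X @ theta_unlearned - targets) / n
    theta_unlearned -= 0.1 * grad

# Distance in output space to the truly retrained model, before and after.
gap_before = np.linalg.norm(X @ theta_full - X @ theta_retrain)
gap_after = np.linalg.norm(X @ theta_unlearned - X @ theta_retrain)
print(f"gap before: {gap_before:.4f}, gap after: {gap_after:.4f}")
```

In this linear setting the matching step is nearly redundant, since the predicted parameters could be used directly. The two-step structure is kept because in a non-convex model only predicted outputs, not parameters, would be available to fine-tune against.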
Taxonomy
Topics: Machine Learning and Data Classification
Methods: Sparse Evolutionary Training
