Attribute-to-Delete: Machine Unlearning via Datamodel Matching

Kristian Georgiev; Roy Rinberg; Sung Min Park; Shivam Garg; Andrew Ilyas; Aleksander Madry; Seth Neel

arXiv:2410.23232·cs.LG·November 13, 2024

TL;DR

This paper introduces Datamodel Matching (DMM), a machine unlearning method that uses data attribution to produce a model whose outputs are statistically indistinguishable from those of a model retrained without the forget set. It shows strong empirical results, including in non-convex settings.

Contribution

The paper proposes Datamodel Matching (DMM), a meta-algorithm that reduces unlearning to data attribution, and demonstrates improved performance over existing unlearning methods.

Findings

01. DMM outperforms existing unlearning algorithms in convex settings.

02. DMM achieves strong unlearning performance in non-convex models.

03. Future data attribution advances can enhance unlearning techniques.

Abstract

Machine unlearning -- efficiently removing the effect of a small "forget set" of training data on a pre-trained machine learning model -- has recently attracted significant research interest. Despite this interest, however, recent work shows that existing machine unlearning techniques do not hold up to thorough evaluation in non-convex settings. In this work, we introduce a new machine unlearning technique that exhibits strong empirical performance even in such challenging settings. Our starting point is the perspective that the goal of unlearning is to produce a model whose outputs are statistically indistinguishable from those of a model re-trained on all but the forget set. This perspective naturally suggests a reduction from the unlearning problem to that of data attribution, where the goal is to predict the effect of changing the training set on a model's outputs. Thus motivated,…
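
The reduction described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation, and the excerpt above is truncated before it describes the method in detail. The sketch assumes a ridge-regression model, uses a first-order influence estimate as a stand-in for a data attribution method, and assumes (as the name "Datamodel Matching" suggests) that the pre-trained model is then fine-tuned to match the predicted outputs. All function names, the synthetic data, and the hyperparameters are hypothetical.

```python
import numpy as np


def fit_ridge(X, y, lam=1e-3):
    """Closed-form ridge regression; stands in for 'training a model'."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def predict_outputs_without(X, y, theta, forget_idx, X_match, lam=1e-3):
    """Attribution step (assumed form): estimate the model's outputs on
    X_match if the forget set had been left out of training.

    Uses a first-order influence approximation, in which the effect of
    removing a group of points is the sum of per-point effects."""
    d = X.shape[1]
    H = X.T @ X + lam * np.eye(d)
    residuals = y[forget_idx] - X[forget_idx] @ theta
    delta_theta = np.linalg.solve(H, X[forget_idx].T @ residuals)
    return X_match @ theta - X_match @ delta_theta


def finetune_to_match(theta, X_match, targets, lr=0.1, steps=500):
    """Matching step (assumed form): gradient descent from the pre-trained
    parameters on the squared error against the predicted outputs."""
    theta = theta.copy()
    n = X_match.shape[0]
    for _ in range(steps):
        grad = (2.0 / n) * (X_match.T @ (X_match @ theta - targets))
        theta -= lr * grad
    return theta


def run_demo(seed=0):
    rng = np.random.default_rng(seed)
    n, d = 200, 5
    X = rng.normal(size=(n, d))
    y = X @ rng.normal(size=d) + 0.5 * rng.normal(size=n)

    # Hypothetical forget set: the first 10 points, with shifted labels
    # so that removing them has a visible effect.
    forget_idx = np.arange(10)
    y[forget_idx] += 3.0
    retain_idx = np.setdiff1d(np.arange(n), forget_idx)

    theta_full = fit_ridge(X, y)                                # pre-trained model
    theta_retrained = fit_ridge(X[retain_idx], y[retain_idx])   # reference model

    targets = predict_outputs_without(X, y, theta_full, forget_idx, X)
    theta_unlearned = finetune_to_match(theta_full, X, targets)

    # Compare outputs on fresh inputs against the retrained reference.
    X_eval = rng.normal(size=(100, d))
    gap_before = np.mean(np.abs(X_eval @ (theta_full - theta_retrained)))
    gap_after = np.mean(np.abs(X_eval @ (theta_unlearned - theta_retrained)))
    return gap_before, gap_after


if __name__ == "__main__":
    before, after = run_demo()
    print(f"mean output gap to retrained model: before={before:.4f}, after={after:.4f}")
```

In this linear toy case the attribution estimate is nearly exact, so the fine-tuned model's outputs end up much closer to the retrained model's than the original model's were. The non-convex settings the paper targets would need a real attribution method and a model-appropriate matching loss, neither of which this sketch attempts.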

Taxonomy

Topics: Machine Learning and Data Classification

Methods: Sparse Evolutionary Training