TL;DR
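This paper uses L-CODEC, a variant of a conditional independence coefficient, to pick out the model parameters most closely tied to a given training sample, so that the sample can be approximately unlearned from a deep model without inverting the full Hessian. The approach targets vision and NLP models.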
Contribution
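The paper proposes a scalable deep unlearning procedure that uses L-CODEC, a variant of a new conditional independence coefficient, to identify the subset of parameters with the most semantic overlap with the sample to be removed. Restricting the update to that subset avoids inverting a Hessian whose size grows with the model dimension d.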
Findings
L-CODEC enables approximate unlearning in high-dimensional models.
The method is effective for vision and NLP models.
It avoids computationally expensive matrix inversions.
Abstract
Recent legislation has led to interest in machine unlearning, i.e., removing specific training samples from a predictive model as if they never existed in the training dataset. Unlearning may also be required due to corrupted/adversarial data or simply a user's updated privacy requirement. For models which require no training (k-NN), simply deleting the closest original sample can be effective. But this idea is inapplicable to models which learn richer representations. Recent ideas leveraging optimization-based updates scale poorly with the model dimension d, due to inverting the Hessian of the loss function. We use a variant of a new conditional independence coefficient, L-CODEC, to identify a subset of the model parameters with the most semantic overlap on an individual sample level. Our approach completely avoids the need to invert a (possibly) huge matrix. By utilizing a Markov…
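The cost argument in the abstract can be illustrated with a toy model. The sketch below is not the paper's algorithm: it assumes a ridge regression loss, for which a single Newton step on the remaining data reproduces retraining exactly, and it compares a full d x d solve with a solve restricted to a small parameter subset. The subset selection here is a placeholder heuristic (largest gradient magnitude on the removed sample), not L-CODEC, and all function names are hypothetical.

```python
import numpy as np

def ridge_loss(w, X, y, lam):
    # Squared-error loss with an L2 penalty.
    r = X @ w - y
    return 0.5 * r @ r + 0.5 * lam * w @ w

def remaining_grad_hess(w, X, y, remove_idx, lam):
    # Gradient and Hessian of the loss on the data that is kept.
    keep = np.ones(len(y), dtype=bool)
    keep[remove_idx] = False
    Xk, yk = X[keep], y[keep]
    grad = Xk.T @ (Xk @ w - yk) + lam * w
    hess = Xk.T @ Xk + lam * np.eye(X.shape[1])
    return grad, hess, Xk, yk

def full_newton_unlearn(w, X, y, remove_idx, lam):
    # Solves a d x d system: the step that scales poorly with d.
    grad, hess, _, _ = remaining_grad_hess(w, X, y, remove_idx, lam)
    return w - np.linalg.solve(hess, grad)

def subset_newton_unlearn(w, X, y, remove_idx, lam, subset):
    # Solves only a k x k system over the selected coordinates;
    # all other parameters are left unchanged.
    grad, hess, _, _ = remaining_grad_hess(w, X, y, remove_idx, lam)
    w_new = w.copy()
    block = hess[np.ix_(subset, subset)]
    w_new[subset] -= np.linalg.solve(block, grad[subset])
    return w_new

# Synthetic data (assumption: purely illustrative).
rng = np.random.default_rng(0)
n, d, k, lam = 50, 8, 3, 0.1
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

# Model trained on all samples.
w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

remove_idx = 3
# Placeholder selection rule, NOT L-CODEC: take the k coordinates where
# the removed sample's own gradient is largest in magnitude.
g_removed = X[remove_idx] * (X[remove_idx] @ w - y[remove_idx])
subset = np.argsort(-np.abs(g_removed))[:k]

w_full = full_newton_unlearn(w, X, y, remove_idx, lam)
w_subset = subset_newton_unlearn(w, X, y, remove_idx, lam, subset)

# Reference: retrain from scratch without the removed sample.
_, hess_k, Xk, yk = remaining_grad_hess(w, X, y, remove_idx, lam)
w_retrained = np.linalg.solve(hess_k, Xk.T @ yk)

loss_before = ridge_loss(w, Xk, yk, lam)
loss_subset = ridge_loss(w_subset, Xk, yk, lam)
loss_full = ridge_loss(w_full, Xk, yk, lam)
```

In this quadratic setting the full update matches retraining, while the subset update moves only k of the d parameters and lowers the remaining-data loss without reaching the retrained optimum. For deep models the loss is not quadratic, so neither update is exact there; how the paper selects the subset and bounds the resulting approximation is not covered by this sketch.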
