The Utility and Complexity of in- and out-of-Distribution Machine Unlearning
Youssef Allouah, Joshua Kazdan, Rachid Guerraoui, Sanmi Koyejo

TL;DR
This paper analyzes the utility, time, and space complexity trade-offs of approximate machine unlearning under formal, differential-privacy-style guarantees. It shows that a simple procedure suffices for in-distribution forget data and proposes a new algorithm for out-of-distribution forget data.
Contribution
The paper provides a formal analysis of unlearning trade-offs, shows that a simple and general method handles in-distribution forget data, and proposes a robust algorithm for out-of-distribution unlearning.
Findings
Empirical risk minimization with output perturbation achieves tight unlearning-utility-complexity trade-offs for in-distribution forget data.
Out-of-distribution unlearning can require more time than retraining, even for single samples.
A new gradient descent variant can amortize unlearning time without losing utility.
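As a rough illustration of the first finding, the sketch below shows one way empirical risk minimization with output perturbation could look for ridge regression. It is a minimal sketch under stated assumptions, not the paper's algorithm: the function names (`erm_ridge`, `output_perturbation`, `unlearn`), the choice of ridge regression, and the noise scale `sigma` are all hypothetical. In particular, `sigma` is left as a free parameter here, whereas a certified guarantee would need it calibrated to the loss and the target privacy-style budget.

```python
import numpy as np

def erm_ridge(X, y, lam):
    """Exact minimizer of (1/2n)||X theta - y||^2 + (lam/2)||theta||^2."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)

def output_perturbation(theta, sigma, rng):
    """Add isotropic Gaussian noise to the learned parameters."""
    return theta + sigma * rng.standard_normal(theta.shape)

def unlearn(X, y, forget_idx, lam, sigma, rng):
    """Hypothetical unlearning step: minimize the empirical risk on the
    retain set only, then perturb the output with Gaussian noise.
    sigma is an assumed free parameter, not a calibrated value."""
    keep = np.ones(len(y), dtype=bool)
    keep[forget_idx] = False
    theta_retain = erm_ridge(X[keep], y[keep], lam)
    return output_perturbation(theta_retain, sigma, rng)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 5))
    y = X @ np.arange(1.0, 6.0) + 0.1 * rng.standard_normal(200)
    theta_unlearned = unlearn(X, y, forget_idx=[0, 1, 2], lam=0.1, sigma=0.01, rng=rng)
    print(theta_unlearned)
```

The sketch solves the retain-set problem in closed form for simplicity; in a larger model that step would be an iterative optimizer, and its cost is the time complexity that the trade-off analysis is concerned with.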
Abstract
Machine unlearning, the process of selectively removing data from trained models, is increasingly crucial for addressing privacy concerns and knowledge gaps post-deployment. Despite this importance, existing approaches are often heuristic and lack formal guarantees. In this paper, we analyze the fundamental utility, time, and space complexity trade-offs of approximate unlearning, providing rigorous certification analogous to differential privacy. For in-distribution forget data -- data similar to the retain set -- we show that a surprisingly simple and general procedure, empirical risk minimization with output perturbation, achieves tight unlearning-utility-complexity trade-offs, addressing a previous theoretical gap on the separation from unlearning "for free" via differential privacy, which inherently facilitates the removal of such data. However, such techniques fail with…
Taxonomy
Topics: Belt Conveyor Systems Engineering · Industrial Automation and Control Systems · Smart Grid Energy Management
Methods: Sparse Evolutionary Training
