Machine Unlearning via Information Theoretic Regularization
Shizhou Xu, Thomas Strohmer

TL;DR
This paper introduces an information-theoretic regularization framework for machine unlearning, enabling effective removal of data points or features with formal guarantees and broad applicability.
Contribution
It proposes a unified mathematical framework for data point and feature unlearning, with formal definitions, guarantees, and solutions applicable to deep learning models.
Findings
Introduces the Marginal Unlearning Principle inspired by neuroscience.
Provides formal information-theoretic unlearning definitions and guarantees.
Offers a practical, adaptable approach for feature unlearning in deep learning.
Abstract
How can we effectively remove or "unlearn" undesirable information, such as specific features or the influence of individual data points, from a learning outcome while minimizing utility loss and ensuring rigorous guarantees? We introduce a unified mathematical framework based on information-theoretic regularization to address both data point unlearning and feature unlearning. For data point unlearning, we introduce the Marginal Unlearning Principle, an auditable and provable framework inspired by memory suppression studies in neuroscience. Moreover, we provide a formal information-theoretic unlearning definition based on the proposed principle, named marginal unlearning, and provable guarantees on the sufficiency and necessity of marginal unlearning with respect to existing approximate unlearning definitions. We then show that the proposed framework provides a natural solution to the marginal…
Taxonomy
Topics: Fault Detection and Control Systems · Neural Networks and Applications
