Representation Unlearning: Forgetting through Information Compression
Antonio Almudévar, Alfonso Ortega

TL;DR
This paper introduces Representation Unlearning, a novel method that removes specific data influence from models by transforming their representations, leading to more reliable forgetting and improved efficiency compared to traditional parameter-based approaches.
Contribution
It proposes a new framework for unlearning directly in the representation space, using information bottlenecks and variational surrogates, applicable in both data-available and zero-shot scenarios.
Findings
Achieves more reliable forgetting than baselines.
Maintains better utility of the model after unlearning.
Demonstrates greater computational efficiency.
Abstract
Machine unlearning seeks to remove the influence of specific training data from a model, a need driven by privacy regulations and robustness concerns. Existing approaches typically modify model parameters, but such updates can be unstable, computationally costly, and limited by local approximations. We introduce Representation Unlearning, a framework that performs unlearning directly in the model's representation space. Instead of modifying model parameters, we learn a transformation over representations that imposes an information bottleneck: maximizing mutual information with retained data while suppressing information about data to be forgotten. We derive variational surrogates that make this objective tractable and show how they can be instantiated in two practical regimes: when both retain and forget data are available, and in a zero-shot setting where only forget data can be…
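The objective described in the abstract (keep information about retained data, suppress information about forget data, leave the model's parameters alone) can be illustrated with a small NumPy sketch. This is not the authors' method: the linear transformation `W`, the frozen classifier `head`, the weight `beta`, and both loss terms are assumptions chosen for illustration. Cross-entropy on retain data stands in for the "maximize mutual information" term, since minimizing it raises a standard variational lower bound on the mutual information between the transformed representation and the labels. KL divergence to the uniform distribution on forget data is a heuristic stand-in for the "suppress information" term, not a bound derived in the paper.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def bottleneck_loss(W, head, z_retain, y_retain, z_forget, beta=1.0):
    """Hypothetical surrogate loss over a learned transformation W.

    W        : (d, d) transformation applied to representations (the only trainable part)
    head     : (d, k) frozen classifier weights of the original model (assumed linear)
    z_retain : (n_r, d) representations of retain data
    y_retain : (n_r,) integer labels of retain data
    z_forget : (n_f, d) representations of forget data
    beta     : weight of the forgetting term (illustrative hyperparameter)
    """
    eps = 1e-12
    p_retain = softmax(z_retain @ W @ head)
    p_forget = softmax(z_forget @ W @ head)

    # Retain term: cross-entropy with the true labels.
    rows = np.arange(len(y_retain))
    ce_retain = -np.mean(np.log(p_retain[rows, y_retain] + eps))

    # Forget term: KL(p || uniform), which is zero when predictions
    # on forget data carry no class information.
    k = p_forget.shape[1]
    kl_forget = np.mean(np.sum(p_forget * (np.log(p_forget + eps) + np.log(k)), axis=1))

    return ce_retain + beta * kl_forget
```

In this sketch, a transformation that erases everything (for example `W` set to all zeros) drives the forget term to zero but leaves the retain term at log k for k classes, so a useful `W` has to trade the two terms off. In the zero-shot regime the abstract mentions, retain data would not be available, and the retain term would have to be replaced by something else; the truncated abstract does not say what, so that case is not sketched here.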
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
