Reference-Guided Machine Unlearning
Jonas Mirlach, Sonia Laguna, Julia E. Vogt

TL;DR
This paper introduces Reference-Guided Unlearning (ReGUn), a framework that uses a disjoint held-out dataset as a reference so that unlearning better balances forgetting specific data against preserving model utility.
Contribution
ReGUn is an unlearning method that uses a disjoint held-out dataset as a principled, class-conditioned reference for distillation. It outperforms existing heuristic methods across various models and datasets.
Findings
ReGUn achieves better forgetting-utility trade-offs than baselines.
ReGUn is effective across different model architectures and datasets.
ReGUn maintains model performance while removing specific data influence.
Abstract
Machine unlearning aims to remove the influence of specific data from trained models while preserving general utility. Existing approximate unlearning methods often rely on performance-degradation heuristics, such as loss maximization or random labeling. However, these signals can be poorly conditioned, leading to unstable optimization and harming the model's generalization. We argue that unlearning should instead prioritize distributional indistinguishability, aligning the model's behavior on forget data with its behavior on truly unseen data. Motivated by this, we propose Reference-Guided Unlearning (ReGUn), a framework that leverages a disjoint held-out dataset to provide a principled, class-conditioned reference for distillation. We demonstrate across various model architectures, natural image datasets, and varying forget fractions that ReGUn consistently outperforms standard…
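The abstract does not spell out the training objective, so the following is only a minimal sketch of one plausible reading of "class-conditioned reference for distillation". It assumes the reference for a class is the model's average softmax output on held-out samples of that class, and that forget samples are pulled toward that reference with a KL term. All function names (`class_conditioned_reference`, `forget_distillation_loss`) and the choice of KL divergence are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import defaultdict


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def class_conditioned_reference(holdout_logits, holdout_labels):
    """Assumed reference: per-class mean of the model's softmax outputs
    on held-out (never trained on) samples."""
    sums = {}
    counts = defaultdict(int)
    for logits, label in zip(holdout_logits, holdout_labels):
        probs = softmax(logits)
        if label not in sums:
            sums[label] = [0.0] * len(probs)
        sums[label] = [s + p for s, p in zip(sums[label], probs)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def forget_distillation_loss(forget_logits, forget_labels, reference):
    """Mean KL between each forget sample's class reference and the
    model's current output on that sample (hypothetical objective)."""
    losses = [
        kl_divergence(reference[y], softmax(z))
        for z, y in zip(forget_logits, forget_labels)
    ]
    return sum(losses) / len(losses)


# Toy example with two classes and made-up logits.
holdout_logits = [[2.0, 0.0], [0.0, 1.0]]
holdout_labels = [0, 1]
reference = class_conditioned_reference(holdout_logits, holdout_labels)

# A forget sample the model is far more confident on than on unseen data.
overconfident = forget_distillation_loss([[10.0, 0.0]], [0], reference)
# A forget sample the model already treats like unseen data of its class.
matched = forget_distillation_loss([[2.0, 0.0]], [0], reference)
```

In this toy setup the loss is zero when the model's output on a forget sample already matches its typical output on unseen data of the same class, and positive when the model is more confident on the forget sample. That mirrors the indistinguishability goal stated in the abstract, in contrast to heuristics that simply push the loss on forget data upward.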
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
