TL;DR
This paper introduces a two-phase optimization framework for machine unlearning that effectively handles entanglements between forget and retain samples, ensuring accurate removal and retention of information.
Contribution
It presents a two-phase method that combines an augmented Lagrangian phase with a Wasserstein-2-regularized gradient projection phase to improve unlearning when forget and retain samples are entangled.
Findings
Outperforms existing methods in accuracy retention and removal fidelity.
Effective across multiple datasets and neural architectures.
Achieves reliable unlearning with minimal performance degradation.
Abstract
Forgetting a subset in machine unlearning is rarely an isolated task. Often, retained samples that are closely related to the forget set can be unintentionally affected, particularly when they share correlated features from pretraining or exhibit strong semantic similarities. To address this challenge, we propose a novel two-phase optimization framework specifically designed to handle such retain-forget entanglements. In the first phase, an augmented Lagrangian method increases the loss on the forget set while preserving accuracy on less-related retained samples. The second phase applies a gradient projection step, regularized by the Wasserstein-2 distance, to mitigate performance degradation on semantically related retained samples without compromising the unlearning objective. We validate our approach through comprehensive experiments on multiple unlearning tasks, standard benchmark…
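The two building blocks named in the abstract can be illustrated on toy inputs. The sketch below is not the paper's algorithm: the summary does not give the actual objective, constraints, or the Wasserstein-2 regularizer, so everything here is a hypothetical stand-in. `augmented_lagrangian_toy` runs a textbook augmented Lagrangian loop on a scalar problem (minimize (x - 2)^2 subject to x <= 1), where the objective plays the role of "push the forget loss up" and the constraint plays the role of "keep the retain loss within budget". `project_out_conflict` is a plain orthogonal projection, without any Wasserstein term, that removes the part of an update direction that would raise the retain loss to first order.

```python
def augmented_lagrangian_toy(rho=1.0, outer_steps=40, inner_steps=200, lr=0.1):
    """Textbook augmented Lagrangian on a toy problem (hypothetical stand-in).

    Minimizes f(x) = (x - 2)^2 subject to g(x) = x - 1 <= 0.
    The constrained optimum is x = 1 with multiplier lam = 2.
    """
    x, lam = 0.0, 0.0
    for _ in range(outer_steps):
        # Inner loop: gradient descent on the augmented Lagrangian in x.
        for _ in range(inner_steps):
            # d/dx of (1 / (2 * rho)) * max(0, lam + rho * g(x))^2, with g'(x) = 1
            penalty_grad = max(0.0, lam + rho * (x - 1.0))
            grad = 2.0 * (x - 2.0) + penalty_grad
            x -= lr * grad
        # Outer step: multiplier update for an inequality constraint.
        lam = max(0.0, lam + rho * (x - 1.0))
    return x, lam


def project_out_conflict(g_update, g_retain):
    """Drop the component of g_update that conflicts with g_retain.

    A step x <- x - lr * g_update changes the retain loss by roughly
    -lr * <g_update, g_retain>. If that inner product is negative, the
    step would raise the retain loss, so the offending component is
    projected out. Otherwise g_update is returned unchanged.
    """
    dot = sum(a * b for a, b in zip(g_update, g_retain))
    norm_sq = sum(b * b for b in g_retain)
    if dot >= 0.0 or norm_sq == 0.0:
        return list(g_update)
    scale = dot / norm_sq
    return [a - scale * b for a, b in zip(g_update, g_retain)]


if __name__ == "__main__":
    x, lam = augmented_lagrangian_toy()
    print("toy optimum:", x, "multiplier:", lam)
    print("projected:", project_out_conflict([1.0, -2.0], [0.0, 1.0]))
```

In the last call the update direction `[1.0, -2.0]` has a negative inner product with the retain gradient `[0.0, 1.0]`, so the conflicting second component is removed and the function returns `[1.0, 0.0]`. In a real model the vectors would be flattened parameter gradients, and the paper's second phase additionally regularizes the projection with a Wasserstein-2 distance, which this sketch omits.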