Class Unlearning via Depth-Aware Removal of Forget-Specific Directions
Arman Hatami, Romina Aalishah, Ilya E. Monosov

TL;DR
This paper introduces DAMP, a one-shot method for class unlearning that removes forget-specific information from a pretrained deep model without retraining. The authors report more selective forgetting and better preservation of retain-class utility than prior methods.
Contribution
DAMP is a depth-aware, projection-based weight-surgery technique that removes forget directions from a pretrained network without gradient-based optimization. The method also extends to multi-class scenarios.
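As a rough illustration of what projection-based weight surgery can look like, the sketch below removes one input direction from a linear layer in closed form. It is not the authors' implementation: the way the forget direction is built here (forget-class prototype minus the mean of the retain-class prototypes), the function names `forget_direction` and `project_out`, and the strength parameter `alpha` are all assumptions made for this example.

```python
import numpy as np

def forget_direction(feats, labels, forget_class):
    """Unit vector pointing from the mean retain prototype to the forget prototype.

    feats:  (n_samples, d_in) activations at the input of some layer
    labels: (n_samples,) integer class labels
    NOTE: this definition of the direction is an assumption for illustration.
    """
    forget_proto = feats[labels == forget_class].mean(axis=0)
    retain_classes = [c for c in np.unique(labels) if c != forget_class]
    retain_protos = np.stack(
        [feats[labels == c].mean(axis=0) for c in retain_classes]
    )
    d = forget_proto - retain_protos.mean(axis=0)
    return d / np.linalg.norm(d)

def project_out(W, d, alpha=1.0):
    """Return W (I - alpha * d d^T) for a layer computing y = W x.

    W: (d_out, d_in) weight matrix, d: (d_in,) unit vector.
    With alpha = 1 the layer no longer responds to the component of x along d;
    alpha is a hypothetical strength knob that could vary with depth.
    """
    return W - alpha * np.outer(W @ d, d)
```

With `alpha=1.0`, `project_out(W, d) @ d` is zero up to floating-point error, while inputs orthogonal to `d` pass through the layer unchanged. No gradient step is involved; the edit is a single matrix update per layer.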
Findings
DAMP outperforms prior methods in selective forgetting across multiple datasets.
It preserves retain-class performance better than existing approaches.
DAMP reduces residual forget-class structure in deep representations.
Abstract
Machine unlearning aims to remove targeted knowledge from a trained model without the cost of retraining from scratch. In class unlearning, however, reducing accuracy on forget classes does not necessarily imply true forgetting: forgotten information can remain encoded in internal representations, and apparent forgetting may arise from classifier-head suppression rather than representational removal. We show that existing class-unlearning methods often exhibit weak or negative selectivity, preserve forget-class structure in deep representations, or rely heavily on final-layer bias shifts. We then introduce DAMP (Depth-Aware Modulation by Projection), a one-shot, closed-form weight-surgery method that removes forget-specific directions from a pretrained network without gradient-based optimization. At each stage, DAMP computes class prototypes in the input space of the next learnable…
