SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation
Chongyu Fan, Jiancheng Liu, Yihua Zhang, Eric Wong, Dennis Wei, Sijia Liu

TL;DR
SalUn introduces a gradient-based weight saliency approach for machine unlearning, effectively erasing data influence in image classification and generation, with improved accuracy, stability, and cross-domain applicability.
Contribution
This paper presents SalUn, the first principled unlearning method utilizing weight saliency to enhance effectiveness and efficiency across diverse AI tasks.
Findings
Achieves nearly exact unlearning in image classification.
Outperforms state-of-the-art baselines in harmful image prevention.
Demonstrates stability in high-variance data forgetting.
Abstract
With evolving data regulations, machine unlearning (MU) has become an important tool for fostering trust and safety in today's AI models. However, existing MU methods focusing on data and/or weight perspectives often suffer limitations in unlearning accuracy, stability, and cross-domain applicability. To address these challenges, we introduce the concept of 'weight saliency' for MU, drawing parallels with input saliency in model explanation. This innovation directs MU's attention toward specific model weights rather than the entire model, improving effectiveness and efficiency. The resultant method that we call saliency unlearning (SalUn) narrows the performance gap with 'exact' unlearning (model retraining from scratch after removing the forgetting data points). To the best of our knowledge, SalUn is the first principled MU approach that can effectively erase the influence of…
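The abstract's idea of directing unlearning toward specific weights can be illustrated with a toy sketch. It assumes, as the title's "gradient-based weight saliency" suggests, that a weight counts as salient when the magnitude of the forgetting-loss gradient at that weight meets a threshold. The linear model, the squared loss, the `threshold` value, and the gradient-ascent step are all illustrative assumptions, not the paper's actual setup or code.

```python
# Toy sketch only: linear model, squared loss, plain Python lists.
# All function names and hyperparameters here are hypothetical.

def forget_gradient(weights, forget_set):
    """Gradient of the mean squared error on the forgetting set w.r.t. each weight."""
    grad = [0.0] * len(weights)
    for x, y in forget_set:
        pred = sum(w * xi for w, xi in zip(weights, x))
        err = pred - y
        for j, xj in enumerate(x):
            grad[j] += 2.0 * err * xj / len(forget_set)
    return grad

def saliency_mask(grad, threshold):
    """1 where the gradient magnitude meets the (assumed) threshold, else 0."""
    return [1 if abs(g) >= threshold else 0 for g in grad]

def masked_update(weights, update, mask):
    """Apply a proposed update only to salient weights; others stay untouched."""
    return [w + m * u for w, u, m in zip(weights, update, mask)]

weights = [0.5, -1.0, 2.0]
forget_set = [([1.0, 0.0, 0.1], 3.0), ([2.0, 0.0, 0.0], 1.0)]

grad = forget_gradient(weights, forget_set)
mask = saliency_mask(grad, threshold=0.5)   # only the first weight is salient

# Stand-in unlearning step (an assumption): gradient ascent on the forgetting loss.
learning_rate = 0.1
update = [learning_rate * g for g in grad]
new_weights = masked_update(weights, update, mask)

print("mask:", mask)
print("new weights:", new_weights)
```

In this sketch the mask confines the update to the one weight that the forgetting data depends on most, while the remaining weights keep their original values. Any other unlearning update could be passed to `masked_update` in place of the ascent step.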
Taxonomy
Topics: COVID-19 diagnosis using AI · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning
Methods: Diffusion
