Targeted Unlearning with Single Layer Unlearning Gradient
Zikui Cai, Yaoteng Tan, M. Salman Asif

TL;DR
SLUG is a novel, efficient targeted unlearning method that updates a single critical layer in models like CLIP and Stable Diffusion, removing specific information with minimal computational cost.
Contribution
We introduce SLUG, a method that identifies and updates a single layer for targeted unlearning, reducing computational cost while maintaining model utility.
Findings
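SLUG removes both concrete concepts (e.g., identities and objects) and abstract concepts (e.g., artistic styles) from CLIP, Stable Diffusion, and vision-language models.
On the UnlearnCanvas benchmark, SLUG achieves unlearning performance comparable to existing methods.
SLUG requires significantly less computation, since it updates a single layer using a one-time gradient computation.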
Abstract
Machine unlearning methods aim to remove sensitive or unwanted content from trained models, but typically demand extensive model updates at significant computational cost while potentially degrading model performance on both related and unrelated tasks. We propose Single Layer Unlearning Gradient (SLUG) as an efficient method to unlearn targeted information by updating a single critical layer using a one-time gradient computation. SLUG uses layer importance and gradient alignment metrics to identify the optimal layer for targeted information removal while preserving model utility. We demonstrate the effectiveness of SLUG for CLIP, Stable Diffusion, and vision-language models (VLMs) in removing concrete concepts (e.g., identities and objects) and abstract concepts (e.g., artistic styles). On the UnlearnCanvas benchmark, SLUG achieves unlearning performance comparable to existing methods while…
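The abstract names two per-layer metrics (layer importance and gradient alignment) and a single-layer update from a one-time gradient, but the text here does not define them. The sketch below is one plausible reading, not the authors' implementation. It assumes that importance is the norm of the forget-set gradient relative to the norm of the layer weights, that alignment is the absolute cosine similarity between forget-set and retain-set gradients, and that the layer is chosen by the hypothetical score importance * (1 - alignment). All function names, the selection rule, and the toy numbers are illustrative assumptions.

```python
import math

def l2_norm(v):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(x * x for x in v))

def cosine(u, v):
    """Cosine similarity between two flat vectors (0.0 if either is zero)."""
    denom = l2_norm(u) * l2_norm(v)
    return sum(a * b for a, b in zip(u, v)) / denom if denom else 0.0

def layer_scores(params, forget_grads, retain_grads):
    """Return {layer: (importance, alignment)} under the assumed definitions."""
    scores = {}
    for name, theta in params.items():
        g_f, g_r = forget_grads[name], retain_grads[name]
        # Assumed importance: forget-gradient norm relative to weight norm.
        importance = l2_norm(g_f) / l2_norm(theta)
        # Assumed alignment: how much forget and retain gradients overlap.
        alignment = abs(cosine(g_f, g_r))
        scores[name] = (importance, alignment)
    return scores

def pick_layer(scores):
    """Hypothetical rule: prefer high importance and low alignment."""
    return max(scores, key=lambda n: scores[n][0] * (1.0 - scores[n][1]))

def single_layer_update(params, forget_grads, layer, step_size):
    """Step only the chosen layer along the precomputed forget gradient.

    Assumes forget_grads holds gradients of an unlearning objective that
    should be minimized, so the update is theta - step_size * grad.
    """
    updated = {name: list(theta) for name, theta in params.items()}
    updated[layer] = [
        p - step_size * g for p, g in zip(params[layer], forget_grads[layer])
    ]
    return updated

# Toy two-layer example with made-up numbers (not from the paper).
params = {"layer_a": [1.0, 0.0], "layer_b": [0.0, 2.0]}
forget_grads = {"layer_a": [0.5, 0.0], "layer_b": [0.0, 2.0]}
retain_grads = {"layer_a": [1.0, 0.0], "layer_b": [2.0, 0.0]}

scores = layer_scores(params, forget_grads, retain_grads)
chosen = pick_layer(scores)
updated = single_layer_update(params, forget_grads, chosen, step_size=0.5)
print(chosen, updated[chosen])
```

In this toy setup, layer_b has the larger relative forget gradient and its forget gradient is orthogonal to its retain gradient, so it is the one selected and updated; layer_a is left untouched. In a real model the gradients would come from one backward pass over forget and retain data, and the step size would need to be tuned separately.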
Taxonomy
Topics: Optical Systems and Laser Technology · Photonic and Optical Devices · Optical and Acousto-Optic Technologies
Methods: Contrastive Language-Image Pre-training
