Toward Reliable Machine Unlearning: Theory, Algorithms, and Evaluation
Ali Ebrahimpour-Boroojeny

TL;DR
This paper introduces new methods for machine unlearning that outperform existing techniques by leveraging prediction similarity, adversarial fine-tuning, and model smoothness, with theoretical insights and practical algorithms validated on benchmarks.
Contribution
It proposes Adversarial Machine UNlearning (AMUN), FastClip for smooth model training, and Tilted ReWeighting (TRW) for class unlearning, advancing the state of the art in reliability and security.
Findings
AMUN surpasses prior SOTA in image classification unlearning.
FastClip enables scalable training of smooth models with spectral-norm clipping.
TRW effectively mitigates membership inference attacks in class unlearning.
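To make the idea of spectral-norm clipping in the findings above concrete, here is a minimal sketch that caps the largest singular value of a small dense weight matrix at a threshold c. It is not FastClip: it uses plain power iteration followed by a rank-one correction, handles only the leading singular value, and assumes a non-zero matrix. The helper names (top_singular_triplet, clip_top_singular_value) and the toy matrix are hypothetical and chosen for illustration only.

```python
import math

def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def transpose(W):
    return [list(col) for col in zip(*W)]

def normalize(v):
    # Return the unit vector and its original Euclidean norm.
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v], n

def top_singular_triplet(W, iters=200):
    """Estimate (sigma, u, v) for the largest singular value by power iteration."""
    Wt = transpose(W)
    v, _ = normalize([1.0 + 0.1 * i for i in range(len(W[0]))])
    for _ in range(iters):
        u, _ = normalize(matvec(W, v))
        v, _ = normalize(matvec(Wt, u))
    u, sigma = normalize(matvec(W, v))
    return sigma, u, v

def clip_top_singular_value(W, c, iters=200):
    """Lower the leading singular value to c (if it exceeds c) with a rank-one update."""
    sigma, u, v = top_singular_triplet(W, iters)
    if sigma <= c:
        return [row[:] for row in W]
    shrink = sigma - c
    return [[W[i][j] - shrink * u[i] * v[j] for j in range(len(v))]
            for i in range(len(u))]

# Toy layer with singular values 3 and 1; clip the spectral norm to 2.
W = [[3.0, 0.0], [0.0, 1.0]]
W_clipped = clip_top_singular_value(W, c=2.0)
sigma_after, _, _ = top_singular_triplet(W_clipped)
```

In this toy case only the leading singular value changes, so the smaller direction of the layer is left untouched. A full layer-wise method would also need to handle convolutional layers and several singular values above the threshold, which this sketch does not attempt.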
Abstract
We propose new methodologies for both unlearning a random set of samples and class unlearning, and show that they outperform existing methods. The main driver of our unlearning methods is the similarity of predictions to a retrained model on both the forget and remain samples. We introduce Adversarial Machine UNlearning (AMUN), which surpasses prior state-of-the-art methods for image classification based on SOTA MIA scores. AMUN lowers the model's confidence on forget samples by fine-tuning on their corresponding adversarial examples. Through theoretical analysis, we identify factors governing AMUN's performance, including smoothness. To facilitate training of smooth models with a controlled Lipschitz constant, we propose FastClip, a scalable method that performs layer-wise spectral-norm clipping of affine layers. In a separate study, we show that increased smoothness naturally improves…
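The mechanism described for AMUN, fine-tuning on adversarial examples of the forget samples to lower confidence on them, can be illustrated on a toy logistic-regression model. This is a sketch under stated assumptions, not the paper's algorithm: the attack is a single signed-gradient (FGSM-style) step, the adversarial example is paired with the model's own (incorrect) prediction on it, and the model, data, step size, and function names are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    # Probability of class 1 under a logistic-regression model.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(g):
    return (g > 0) - (g < 0)

def fgsm_example(w, b, x, y, eps):
    """One signed-gradient step that increases the cross-entropy loss on (x, y)."""
    p = predict_proba(w, b, x)
    # Gradient of the loss with respect to the input is (p - y) * w.
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

def sgd_step(w, b, x, y, lr):
    """One gradient-descent step on the cross-entropy loss for a single example."""
    p = predict_proba(w, b, x)
    new_w = [wi - lr * (p - y) * xi for wi, xi in zip(w, x)]
    new_b = b - lr * (p - y)
    return new_w, new_b

# Hypothetical trained model and one forget sample with true label 1.
w, b = [2.0, -1.0], 0.0
x_forget, y_forget = [1.0, 0.5], 1

# Assumption: label the adversarial example with the model's prediction on it.
x_adv = fgsm_example(w, b, x_forget, y_forget, eps=1.0)
y_adv = int(predict_proba(w, b, x_adv) >= 0.5)

conf_before = predict_proba(w, b, x_forget)
w_new, b_new = sgd_step(w, b, x_adv, y_adv, lr=0.5)
conf_after = predict_proba(w_new, b_new, x_forget)
```

In this toy setting, one fine-tuning step on the adversarial example reduces the model's confidence in the original label of the forget sample without training on the forget sample itself.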
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · Advanced Image Processing Techniques
