Machine Unlearning on Pre-trained Models by Residual Feature Alignment Using LoRA
Laiqiao Qin, Tianqing Zhu, Linlin Wang, Wanlei Zhou

TL;DR
This paper introduces a novel, efficient machine unlearning method for pre-trained models that uses residual feature alignment with LoRA to selectively unlearn data while preserving model utility.
Contribution
It proposes Residual Feature Alignment Unlearning, leveraging LoRA to adjust residual features for effective unlearning without full model fine-tuning.
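The general LoRA idea behind this contribution can be sketched as follows. This is a minimal illustration of a generic low-rank adapter on one frozen linear layer, not the authors' method: the truncated abstract does not define the residual feature alignment objective, so no unlearning loss is shown. The class name `LoRALinear`, the helper `matvec`, and the values of `r`, `alpha`, and the layer sizes are all assumptions chosen for illustration.

```python
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


class LoRALinear:
    """Frozen weight W plus a trainable low-rank residual (alpha / r) * B @ A."""

    def __init__(self, d_in, d_out, r=2, alpha=4.0, seed=0):
        rng = random.Random(seed)
        # Stand-in for a frozen pre-trained weight (random, illustration only).
        self.W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
        # Trainable factors: A starts small and random, B starts at zero,
        # so the adapter contributes nothing before training.
        self.A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]
        self.scale = alpha / r

    def residual(self, x):
        """Low-rank correction added on top of the frozen layer's output."""
        return [self.scale * v for v in matvec(self.B, matvec(self.A, x))]

    def forward(self, x):
        base = matvec(self.W, x)
        return [b + d for b, d in zip(base, self.residual(x))]

    def trainable_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

    def frozen_params(self):
        return len(self.W) * len(self.W[0])


layer = LoRALinear(d_in=64, d_out=64, r=2)
x = [1.0] * 64
print("trainable:", layer.trainable_params(), "frozen:", layer.frozen_params())
```

In this sketch only `A` and `B` would be updated during unlearning, which is 256 values against 4096 frozen ones for a 64x64 layer at rank 2. Because `B` starts at zero, the adapted layer initially reproduces the frozen layer's output exactly.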
Findings
Demonstrates effective unlearning on multiple datasets.
Maintains model performance on retained data.
Reduces computational costs compared to full fine-tuning.
Abstract
Machine unlearning is an emerging technology that removes a subset of the training data from a trained model without significantly affecting the model performance on the remaining data. This topic is becoming increasingly important in protecting user privacy and eliminating harmful or outdated data. The key challenge lies in effectively and efficiently unlearning specific information without compromising the model's utility on the retained data. For pre-trained models, fine-tuning is an important way to achieve the unlearning target. Previous work typically fine-tuned the entire model's parameters, which incurred significant computational costs. In addition, the fine-tuning process may cause shifts in the intermediate layer features, affecting the model's overall utility. In this work, we propose a novel and efficient machine unlearning method for pre-trained models. We term the method…
