Representation-Guided Parameter-Efficient LLM Unlearning
Zeguan Xiao, Lang Mo, Yun Chen, Lei Yang, Jiehui Zhao, Lili Yang, Guanhua Chen

TL;DR
This paper introduces REGLU (Representation-Guided Low-rank Unlearning), a representation-guided unlearning method for LLMs that improves the forget-retain balance by leveraging geometric properties of representation spaces.
Contribution
REGLU combines a representation-guided initialization for LoRA with orthogonal regularization to make unlearning more precise while preserving model utility.
Findings
REGLU outperforms state-of-the-art baselines on TOFU and WMDP benchmarks.
REGLU achieves better unlearning quality than these baselines while retaining higher model utility.
The method effectively disentangles forget and retain information in LLMs.
Abstract
Large Language Models (LLMs) often memorize sensitive or harmful information, necessitating effective machine unlearning techniques. While existing parameter-efficient unlearning methods have shown promise, they still struggle with the forget-retain trade-off. This can be attributed to their reliance on parameter importance metrics to identify parameters that are important exclusively for the forget set, which is fundamentally limited by the superposition phenomenon. Due to the polysemantic nature of LLM parameters, such an importance metric may struggle to disentangle parameters associated with the forget and retain sets. In this work, we propose Representation-Guided Low-rank Unlearning (REGLU), a novel approach that leverages the geometric properties of representation spaces to achieve robust and precise unlearning. First, we develop a representation-guided initialization for LoRA…
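The abstract is cut off before it describes how the initialization or the regularizer is actually built, so the sketch below is not the authors' algorithm. It only illustrates, under stated assumptions, what "using the geometry of representation spaces" could look like for a LoRA adapter: estimate a retain subspace with an SVD, pick forget directions outside that subspace to initialize the LoRA down-projection, and penalize any overlap between the adapter and the retain subspace. The function names, the SVD-based subspace estimate, the random stand-in activations, and the form of the penalty are all hypothetical choices made for illustration.

```python
import numpy as np

def retain_basis(H_retain, k):
    """Top-k principal directions of retain-set representations (d x k).
    Assumption: the retain subspace is estimated by an SVD of centered activations."""
    Hc = H_retain - H_retain.mean(axis=0, keepdims=True)
    _, _, Vt = np.linalg.svd(Hc, full_matrices=False)
    return Vt[:k].T

def forget_directions(H_forget, U_retain, r):
    """Top-r directions of forget-set representations after projecting out
    the retain subspace (d x r). Hypothetical initialization rule."""
    Hc = H_forget - H_forget.mean(axis=0, keepdims=True)
    d = U_retain.shape[0]
    P = np.eye(d) - U_retain @ U_retain.T  # projector onto the complement
    _, _, Vt = np.linalg.svd(Hc @ P, full_matrices=False)
    return Vt[:r].T

def orthogonality_penalty(A, U_retain):
    """Squared Frobenius norm of A @ U_retain, where A is an (r x d) LoRA
    down-projection. It is zero when A ignores the retain subspace."""
    return float(np.sum((A @ U_retain) ** 2))

# Toy usage with random stand-ins for hidden states (n_samples x d).
rng = np.random.default_rng(0)
H_retain = rng.normal(size=(40, 16))
H_forget = rng.normal(size=(50, 16))

U_r = retain_basis(H_retain, k=4)
A = forget_directions(H_forget, U_r, r=2).T   # (2 x 16) initial LoRA "A" matrix
A_rand = rng.normal(size=(2, 16))             # unguided random init for contrast

print(orthogonality_penalty(A, U_r))       # close to zero by construction
print(orthogonality_penalty(A_rand, U_r))  # clearly positive
```

In this toy setup the guided initialization starts with essentially no overlap with the retain subspace, whereas a random initialization does not; a penalty of this kind added to the training loss would discourage the adapter from drifting back into retain directions.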
