Erasing Without Remembering: Implicit Knowledge Forgetting in Large Language Models
Huazheng Wang, Yongcheng Jing, Haifeng Sun, Yingjie Wang, Jingyu Wang, Jianxin Liao, Dacheng Tao

TL;DR
This paper explores the challenge of forgetting implicit knowledge in large language models, proposing a novel method called PerMU that improves unlearning effectiveness and generalization across various datasets and model scales.
Contribution
It introduces PerMU, a probability perturbation-based unlearning approach that enhances implicit knowledge forgetting in large language models.
Findings
PerMU achieves up to 50.40% improvement in unlearning target data.
PerMU increases implicit knowledge forgetting by 40.73%.
After unlearning with existing methods, models still recall paraphrased answers and retain target facts in their intermediate layers.
Abstract
In this paper, we investigate knowledge forgetting in large language models with a focus on its generalisation, ensuring that models forget not only specific training samples but also related implicit knowledge. To this end, we begin by identifying a broader unlearning scope that includes both target data and logically associated samples, including rephrased, subject-replaced, relation-reversed, and one-hop reasoned data. We then conduct a rigorous evaluation of 15 state-of-the-art methods across three datasets, revealing that unlearned models still recall paraphrased answers and retain target facts in their intermediate layers. This motivates us to take a preliminary step toward more generalised implicit knowledge forgetting by proposing PerMU, a novel probability perturbation-based unlearning paradigm. PerMU simulates adversarial unlearning samples to eliminate fact-related tokens…
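The broader unlearning scope described in the abstract can be illustrated with a small sketch. It is not the authors' benchmark code: the `Probe` class, the `build_scope` and `leaked` helpers, the hand-written prompt templates, and the reading of "subject-replaced" as an alternative name for the same entity are all assumptions made for illustration. The example builds one probe per scope category for a single fact and checks which probes a stand-in model still answers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    kind: str    # which part of the unlearning scope this probe covers
    prompt: str  # text shown to the model
    answer: str  # answer the model should no longer produce


def build_scope(subject, alias, obj, obj_attribute):
    """Build one probe per scope category for a single 'born in' fact.

    Hypothetical templates, written by hand for illustration only.
    """
    return [
        Probe("target", f"{subject} was born in", obj),
        Probe("rephrased", f"The birthplace of {subject} is", obj),
        Probe("subject_replaced", f"{alias} was born in", obj),
        Probe("relation_reversed", f"{obj} is the birthplace of", subject),
        Probe("one_hop", f"The country where {subject} was born is", obj_attribute),
    ]


def leaked(probes, generate):
    """Return the kinds of probes whose answer still appears in the output."""
    return [
        p.kind for p in probes
        if p.answer.lower() in generate(p.prompt).lower()
    ]


probes = build_scope("Marie Curie", "Maria Skłodowska", "Warsaw", "Poland")


def toy_model(prompt):
    # Stand-in for an unlearned model (assumption): it refuses only the
    # exact target prompt and still answers every other phrasing.
    if prompt == "Marie Curie was born in":
        return "I don't know."
    return "Warsaw, Poland"


print(leaked(probes, toy_model))
```

In this toy setup the exact target prompt no longer leaks, while the rephrased, subject-replaced, and one-hop probes still do, which mirrors the kind of residual recall the abstract reports for existing methods. A real evaluation would replace `toy_model` with calls to the unlearned model and use a stricter match than substring containment.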
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
