Erasing Without Remembering: Implicit Knowledge Forgetting in Large Language Models

Huazheng Wang; Yongcheng Jing; Haifeng Sun; Yingjie Wang; Jingyu Wang; Jianxin Liao; Dacheng Tao

arXiv:2502.19982·cs.CL·October 10, 2025

TL;DR

This paper explores the challenge of forgetting implicit knowledge in large language models, proposing a novel method called PerMU that improves unlearning effectiveness and generalization across various datasets and model scales.

Contribution

It introduces PerMU, a probability perturbation-based unlearning approach that enhances implicit knowledge forgetting in large language models.

Findings

01. PerMU achieves up to 50.40% improvement in unlearning target data.

02. PerMU increases implicit knowledge forgetting by 40.73%.

03. Models still recall paraphrased answers after unlearning.

Abstract

In this paper, we investigate knowledge forgetting in large language models with a focus on its generalisation, ensuring that models forget not only specific training samples but also related implicit knowledge. To this end, we begin by identifying a broader unlearning scope that includes both target data and logically associated samples, including rephrased, subject-replaced, relation-reversed, and one-hop reasoned data. We then conduct a rigorous evaluation of 15 state-of-the-art methods across three datasets, revealing that unlearned models still recall paraphrased answers and retain target facts in their intermediate layers. This motivates us to take a preliminary step toward more generalised implicit knowledge forgetting by proposing PerMU, a novel probability perturbation-based unlearning paradigm. PerMU simulates adversarial unlearning samples to eliminate fact-related tokens…
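
The broader unlearning scope described in the abstract can be pictured as a set of probes grouped by how they relate to the target fact. The sketch below is only an illustration under stated assumptions: the fictional fact, the probe wording, the category keys, the `forget_rate` helper, and the substring-matching criterion are all hypothetical and are not taken from the paper or its benchmark. The toy "model" has forgotten only the verbatim target question, which mirrors the finding that unlearned models can still recall paraphrased answers.

```python
from typing import Callable, Dict, List, Tuple

# (question, answer string that should no longer be produced after unlearning)
Probe = Tuple[str, str]

# Hypothetical probes for one fictional target fact; all names are invented.
SCOPE: Dict[str, List[Probe]] = {
    "target": [("Where was Ada Quill born?", "Northport")],
    "rephrased": [("What is Ada Quill's birthplace?", "Northport")],
    "subject_replaced": [("Where was the author of 'Tide Maps' born?", "Northport")],
    "relation_reversed": [("Which writer was born in Northport?", "Ada Quill")],
    "one_hop": [("In which country was Ada Quill born?", "Veland")],
}


def forget_rate(answer_fn: Callable[[str], str],
                scope: Dict[str, List[Probe]]) -> Dict[str, float]:
    """Fraction of probes per category whose answer no longer appears.

    Substring matching is a deliberate simplification for this sketch.
    """
    rates = {}
    for category, probes in scope.items():
        forgotten = sum(
            1 for question, answer in probes
            if answer.lower() not in answer_fn(question).lower()
        )
        rates[category] = forgotten / len(probes)
    return rates


def toy_model(question: str) -> str:
    """Stand-in for an unlearned model: only the verbatim target is erased."""
    memory = {q: a for probes in SCOPE.values() for q, a in probes}
    if question == "Where was Ada Quill born?":
        return "I don't know."
    return memory.get(question, "I don't know.")


rates = forget_rate(toy_model, SCOPE)
for category, rate in rates.items():
    print(f"{category}: {rate:.2f}")
```

Reporting the rate per category, rather than one pooled number, keeps a high score on the target data from hiding retained knowledge in the rephrased, subject-replaced, relation-reversed, or one-hop probes.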

Code & Models

Repositories

maybelizzy/ugbench (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: Focus