TL;DR
This paper introduces GONE, a benchmark for evaluating knowledge unlearning in LLMs using structured knowledge graphs, and proposes NEDS, a novel framework leveraging graph connectivity for precise unlearning.
Contribution
The paper presents GONE, a structured knowledge graph benchmark for unlearning, and NEDS, a new method that improves unlearning accuracy by exploiting graph structure.
Findings
NEDS achieves perfect unlearning efficacy (1.000) on GONE.
NEDS significantly reduces reasoning-based leakage.
NEDS outperforms existing methods on multiple benchmarks.
Abstract
Unlearning knowledge is a pressing and challenging task in Large Language Models (LLMs) because their unprecedented capability to memorize and digest training data at scale raises significant safety, privacy, and intellectual property concerns. However, existing approaches, including parameter editing, fine-tuning, and distillation-based methods, focus on flat sentence-level data and overlook the relational, multi-hop, and reasoned knowledge found in naturally structured data. In response to this gap, this paper introduces Graph Oblivion and Node Erasure (GONE), a benchmark for evaluating knowledge unlearning over structured knowledge graph (KG) facts in LLMs. This KG-based benchmark enables the disentanglement of three effects of unlearning: direct fact removal, reasoning-based leakage, and catastrophic forgetting. In addition, Neighborhood-Expanded Distribution…
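To make the three effects named in the abstract concrete, here is a minimal sketch of how probes over a knowledge graph might be split relative to one forgotten triple. It is an illustration only, not the GONE construction: the toy graph, the `Triple` type, and the `build_probe_sets` helper are all hypothetical, and the two-hop rule used for "leakage" is an assumed simplification.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    head: str
    relation: str
    tail: str


# Hypothetical toy graph; none of these facts come from the paper.
KG = [
    Triple("Alice", "works_at", "Acme"),
    Triple("Acme", "located_in", "Berlin"),
    Triple("Bob", "works_at", "Globex"),
    Triple("Globex", "located_in", "Paris"),
]


def build_probe_sets(kg, forget):
    """Split a graph into three probe sets relative to one forgotten triple.

    direct  : the forgotten triple itself (direct fact removal)
    leakage : two-hop chains that pass through the forgotten edge, i.e.
              composed facts a model could still answer if the edge survives
              (an assumed stand-in for reasoning-based leakage)
    retain  : triples sharing no entity with the forgotten triple, which
              should stay answerable (a check on catastrophic forgetting)
    """
    forgotten_entities = {forget.head, forget.tail}
    leakage = [
        (a, b)
        for a in kg
        for b in kg
        if a.tail == b.head and forget in (a, b)
    ]
    retain = [t for t in kg if not forgotten_entities & {t.head, t.tail}]
    return {"direct": [forget], "leakage": leakage, "retain": retain}


forget = KG[0]
sets = build_probe_sets(KG, forget)
for name, probes in sets.items():
    print(name, probes)
```

In this sketch, the triple that shares an entity with the forgotten fact but is not forgotten itself (`Acme located_in Berlin`) lands in neither `direct` nor `retain`. How such neighboring facts should be treated is the kind of question a graph-structured benchmark can pose explicitly.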
