How Data Inter-connectivity Shapes LLMs Unlearning: A Structural Unlearning Perspective
Xinchi Qiu, William F. Shen, Yihong Chen, Meghdad Kurmanji, Nicola Cancedda, Pontus Stenetorp, Nicholas D. Lane

TL;DR
This paper introduces PISTOL, a method for creating structured datasets that reveal how inter-connectivity in data affects the difficulty of unlearning in large language models, highlighting challenges in balancing performance across domains.
Contribution
The paper presents PISTOL, a novel dataset compilation method that incorporates data inter-connectivity to study its effects on LLM unlearning, addressing a limitation of prior work, which assumes that the data points to be forgotten are independent.
Findings
Unlearning difficulty increases with data inter-connectivity.
Higher knowledge graph density correlates with greater unlearning difficulty.
Skewed domain data makes balancing performance across domains more challenging.
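To make "knowledge graph density" concrete, here is a minimal sketch that treats entities as nodes and contractual relationships as undirected edges. This summary does not say how the paper measures density, so the standard definition 2|E| / (|V|(|V| - 1)) is an assumption here, and the entity names and the `graph_density` helper are hypothetical.

```python
from itertools import combinations


def graph_density(num_nodes, edges):
    """Density of a simple undirected graph: 2|E| / (|V| * (|V| - 1)).

    Assumption: this is the textbook definition, not necessarily the
    measure used in the paper.
    """
    if num_nodes < 2:
        return 0.0
    # Ignore self-loops and count each unordered pair once.
    unique_edges = {frozenset(e) for e in edges if e[0] != e[1]}
    return 2 * len(unique_edges) / (num_nodes * (num_nodes - 1))


# Hypothetical entities linked by contracts (illustrative only).
entities = ["A", "B", "C", "D"]
sparse_contracts = [("A", "B"), ("C", "D")]        # few relationships
dense_contracts = list(combinations(entities, 2))  # every pair has a contract

sparse = graph_density(len(entities), sparse_contracts)
dense = graph_density(len(entities), dense_contracts)
print(f"sparse: {sparse:.2f}, dense: {dense:.2f}")
```

Under the finding above, forgetting a relationship in the dense graph would be expected to be harder than forgetting one in the sparse graph, since each entity takes part in more of the remaining relationships.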
Abstract
While unlearning knowledge from large language models (LLMs) is receiving increasing attention, one important aspect remains unexplored. Existing approaches and benchmarks assume data points to-be-forgotten are independent, ignoring their inter-connectivity - a fundamental characteristic of real-world data structures. In this paper, we propose PISTOL, a method for compiling structural datasets. PISTOL leverages the inherently structured nature of contractual relationships, offering several key benefits. First, it enables insights into the impact of structural data on unlearning effectiveness. Second, it provides precise and concise ground truths for clearer evaluation. Third, its attribute generation does not require input from pre-trained LLMs, mitigating confounding risks. Leveraging datasets synthesized using PISTOL, we demonstrate how data inter-connectivity impacts LLM unlearning.…
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Open Education and E-Learning · Natural Language Processing Techniques
