The Curse of Popularity: Popular Entities have Catastrophic Side Effects when Deleting Knowledge from Language Models
Ryosuke Takahashi, Go Kamoda, Benjamin Heinzerling, Keisuke Sakaguchi, Kentaro Inui

TL;DR
This paper investigates the risks of deleting knowledge from language models. It shows that removing knowledge about popular entities can cause severe side effects, and it analyzes deletion in models trained on synthetic knowledge graphs, which allows controlled experiments.
Contribution
It is the first to analyze knowledge deletion effects on models trained with synthetic knowledge graphs, highlighting the catastrophic side effects associated with popular entities.
Findings
Deleting popular entity knowledge can cause catastrophic side effects
Knowledge deletion impacts are more severe for popular entities
Synthetic knowledge graphs enable controlled experiments on knowledge removal
Abstract
Language models (LMs) encode world knowledge in their internal parameters through training. However, LMs may learn personal and confidential information from the training data, leading to privacy concerns such as data leakage. Therefore, research on knowledge deletion from LMs is essential. This study focuses on the knowledge stored in LMs and analyzes the relationship between the side effects of knowledge deletion and the entities related to the knowledge. Our findings reveal that deleting knowledge related to popular entities can have catastrophic side effects. Furthermore, this research is the first to analyze knowledge deletion in models trained on synthetic knowledge graphs, indicating a new direction for controlled experiments.
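To make the idea of a controlled setting concrete, the following is a minimal sketch of a synthetic knowledge graph in which some entities are far more "popular" than others. It is not the authors' construction: the abstract does not define popularity or describe how the graphs are generated. The sketch assumes that popularity means the number of triples an entity appears in, and that subjects are sampled with Zipf-like weights. All function names and parameters (`build_synthetic_kg`, `popularity`, `facts_about`) are hypothetical.

```python
import random
from collections import Counter


def build_synthetic_kg(num_entities=50, num_relations=5, num_triples=300, seed=0):
    """Sample unique (subject, relation, object) triples.

    Assumption: subjects are drawn with Zipf-like weights so that a few
    entities take part in many facts. This is an illustrative choice only.
    """
    rng = random.Random(seed)
    entities = [f"e{i}" for i in range(num_entities)]
    relations = [f"r{j}" for j in range(num_relations)]
    weights = [1.0 / (rank + 1) for rank in range(num_entities)]

    triples = set()
    while len(triples) < num_triples:
        subj = rng.choices(entities, weights=weights, k=1)[0]
        obj = rng.choice(entities)
        if subj == obj:
            continue  # skip self-loops
        triples.add((subj, rng.choice(relations), obj))
    return sorted(triples)


def popularity(triples):
    """Count how many triples each entity appears in (assumed popularity proxy)."""
    counts = Counter()
    for subj, _, obj in triples:
        counts[subj] += 1
        counts[obj] += 1
    return counts


def facts_about(triples, entity):
    """Return the triples that mention `entity`, i.e. candidate deletion targets."""
    return [t for t in triples if entity in (t[0], t[2])]


if __name__ == "__main__":
    kg = build_synthetic_kg()
    pop = popularity(kg)
    most_popular, degree = pop.most_common(1)[0]
    print(most_popular, degree, len(facts_about(kg, most_popular)))
```

Because every fact in such a graph is known in advance, one could in principle train a model on it, delete the facts about a chosen entity, and then check exactly which of the remaining facts are still recalled. Comparing that outcome for high-count and low-count entities is the kind of controlled comparison the abstract points to.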
Taxonomy
Topics: Computational and Text Analysis Methods
