Understanding the Collapse of LLMs in Model Editing
Wanli Yang, Fei Sun, Jiajun Tan, Xinyu Ma, Du Su, Dawei Yin, Huawei Shen

TL;DR
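This paper investigates why large language models often collapse after model editing, particularly with ROME. It traces the collapse to inconsistent handling of prefixed and unprefixed keys in the parameter update equation and to the atypical key distribution of first-token subjects, and it proposes a simple method that prevents collapse while preserving edit effectiveness.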
Contribution
It uncovers root causes of model collapse in LLM editing and introduces a straightforward approach to mitigate this issue effectively.
Findings
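Two causes are identified: inconsistent handling of prefixed and unprefixed keys in the parameter update equation, which can produce very small denominators and excessively large updates, and the fact that the subject in collapse cases is usually the first token, whose unprefixed key distribution differs significantly from the prefixed key distribution.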
The proposed method prevents model collapse during editing.
The approach maintains the effectiveness of model edits.
Abstract
Despite significant progress in model editing methods, their application in real-world scenarios remains challenging as they often cause large language models (LLMs) to collapse. Among them, ROME is particularly concerning, as it could disrupt LLMs with only a single edit. In this paper, we study the root causes of such collapse. Through extensive analysis, we identify two primary factors that contribute to the collapse: i) inconsistent handling of prefixed and unprefixed keys in the parameter update equation may result in very small denominators, causing excessively large parameter updates; ii) the subject of collapse cases is usually the first token, whose unprefixed key distribution significantly differs from the prefixed key distribution in autoregressive transformers, causing the aforementioned issue to materialize. To validate our findings, we propose a simple yet effective…
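The small-denominator effect described in the abstract can be sketched numerically. The snippet below is a toy illustration, not the authors' code. It uses the rank-one closed form from the original ROME formulation, W + (v* - W k)(C^{-1} k)^T / ((C^{-1} k)^T k), and then swaps a different key into the denominator only. Which term receives which key in the real implementation is not stated in this excerpt, so that pairing is an assumption, as are the function name `rank_one_update`, the variable names, and all numbers.

```python
import numpy as np

def rank_one_update(W, C_inv, k_edit, k_denom, v_star):
    """Return (delta, denom) for a ROME-style rank-one update.

    delta = (v* - W k_edit) (C^{-1} k_edit)^T / ((C^{-1} k_edit)^T k_denom)
    Passing k_denom == k_edit gives the consistent closed form;
    passing a different k_denom is the hypothetical inconsistent variant.
    """
    u = C_inv @ k_edit                 # direction C^{-1} k
    denom = float(u @ k_denom)         # scalar denominator
    residual = v_star - W @ k_edit     # what the edit must add at k_edit
    delta = np.outer(residual, u) / denom
    return delta, denom

# Toy 2-d setup (all values are made up for illustration).
W = np.eye(2)
C_inv = np.eye(2)
k_prefixed = np.array([1.0, 0.0])      # stand-in for a prefixed key
k_unprefixed = np.array([1e-3, 1.0])   # stand-in for a divergent unprefixed key
v_star = np.array([0.0, 1.0])

delta_ok, denom_ok = rank_one_update(W, C_inv, k_prefixed, k_prefixed, v_star)
delta_bad, denom_bad = rank_one_update(W, C_inv, k_prefixed, k_unprefixed, v_star)

print("consistent:   denom =", denom_ok, " ||delta|| =", np.linalg.norm(delta_ok))
print("inconsistent: denom =", denom_bad, " ||delta|| =", np.linalg.norm(delta_bad))
```

With the same key in both places, the edited weights map the key exactly to the target value. When the denominator key is nearly orthogonal to C^{-1} k, the denominator shrinks and the update norm grows in inverse proportion, which is the mechanism the abstract points to.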
Taxonomy
Topics: Digital Rights Management and Security
Methods: Rank-One Model Editing
