Understanding the Collapse of LLMs in Model Editing

Wanli Yang; Fei Sun; Jiajun Tan; Xinyu Ma; Du Su; Dawei Yin; Huawei Shen

arXiv:2406.11263·cs.CL·October 1, 2024

TL;DR

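This paper investigates why large language models often collapse after editing, with ROME able to trigger collapse in a single edit. It traces the collapse to two causes: inconsistent handling of prefixed and unprefixed keys in the parameter update equation, and the distinct key distribution of first-token subjects in autoregressive transformers. It then proposes a simple method that prevents collapse while preserving edit effectiveness.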

Contribution

It uncovers two root causes of model collapse in LLM editing, both tied to how keys enter the parameter update equation, and introduces a straightforward approach that mitigates the issue.

Findings

01

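Two causes are identified: prefixed and unprefixed keys are handled inconsistently in the parameter update equation, which can yield very small denominators and excessively large updates; and the subject in collapse cases is usually the first token, whose unprefixed key distribution differs significantly from the prefixed key distribution.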

02

The proposed method prevents model collapse during editing.

03

The approach maintains the effectiveness of model edits.

Abstract

Despite significant progress in model editing methods, their application in real-world scenarios remains challenging as they often cause large language models (LLMs) to collapse. Among them, ROME is particularly concerning, as it could disrupt LLMs with only a single edit. In this paper, we study the root causes of such collapse. Through extensive analysis, we identify two primary factors that contribute to the collapse: i) inconsistent handling of prefixed and unprefixed keys in the parameter update equation may result in very small denominators, causing excessively large parameter updates; ii) the subject of collapse cases is usually the first token, whose unprefixed key distribution significantly differs from the prefixed key distribution in autoregressive transformers, causing the aforementioned issue to materialize. To validate our findings, we propose a simple yet effective…
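The first factor can be illustrated with a small numerical sketch. It assumes the standard ROME rank-one update, ΔW = (v − Wk)(C⁻¹k)ᵀ / ((C⁻¹k)ᵀk), and shows what happens when the key inside C⁻¹k and the key in the denominator's inner product come from different sources. Which position receives the prefixed or unprefixed key here is an assumption made for illustration, not a statement about the authors' code. The diagonal covariance, the vectors, and the helper names `dot` and `update_denominator_and_norm` are likewise hypothetical.

```python
# Illustrative sketch only: toy numbers, hypothetical helper names.
# Assumes a diagonal covariance C so that C^{-1} k is an elementwise product.

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def update_denominator_and_norm(residual, c_inv_diag, k_left, k_denom):
    """Return the denominator (C^{-1} k_left)^T k_denom and the Frobenius
    norm of the rank-one update residual (C^{-1} k_left)^T / denominator."""
    u = [c * k for c, k in zip(c_inv_diag, k_left)]  # C^{-1} k_left
    denom = dot(u, k_denom)
    # The Frobenius norm of an outer product is the product of the vector norms.
    norm = (dot(residual, residual) ** 0.5) * (dot(u, u) ** 0.5) / abs(denom)
    return denom, norm

c_inv_diag = [1.0, 0.5, 2.0, 1.0]      # assumed diagonal of C^{-1}
k_prefixed = [1.0, 2.0, 0.5, -1.0]     # assumed (averaged) prefixed key
k_unprefixed = [2.0, -1.0, 0.01, 1.0]  # assumed unprefixed key from a different distribution
residual = [1.0, 0.0, 0.0, 0.0]        # stand-in for v - W k

# Consistent: the same key on both sides, so the denominator is k^T C^{-1} k > 0.
consistent_denom, consistent_norm = update_denominator_and_norm(
    residual, c_inv_diag, k_prefixed, k_prefixed)

# Mixed: different keys on each side, so the denominator can be close to zero.
mixed_denom, mixed_norm = update_denominator_and_norm(
    residual, c_inv_diag, k_prefixed, k_unprefixed)

print(consistent_denom, consistent_norm)
print(mixed_denom, mixed_norm)
```

With one key used throughout, the denominator is a quadratic form in a positive definite matrix and stays away from zero. With mixed keys nothing bounds it below, so a key drawn from a sufficiently different distribution can shrink it and inflate the update by orders of magnitude.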

Code & Models

Repositories

wanliyoung/collapse-in-model-editing
pytorch

Taxonomy

Topics: Digital Rights Management and Security

Methods: Rank-One Model Editing