Uncovering Overfitting in Large Language Model Editing

Mengqi Zhang; Xiaotian Ye; Qiang Liu; Pengjie Ren; Shu Wu; Zhumin Chen

arXiv:2410.07819 · cs.CL · June 18, 2025

Open Access · 3 Reviews

TL;DR

This paper identifies Editing Overfit, a phenomenon in large language model editing; introduces the EVOKE benchmark to evaluate it; and proposes a new method, LTI, that mitigates overfitting and improves the generalization of edited knowledge.

Contribution

The paper uncovers Editing Overfit in LLM editing, introduces the EVOKE benchmark and fine-grained metrics, and proposes the LTI strategy to reduce overfitting and enhance knowledge transfer.

Findings

01

Editing Overfit is common in current methods.

02

Existing mitigation strategies are ineffective against overfitting.

03

LTI significantly improves knowledge editing performance.

Abstract

Knowledge editing has been proposed as an effective method for updating and correcting the internal knowledge of Large Language Models (LLMs). However, existing editing methods often struggle with complex tasks, such as multi-hop reasoning. In this paper, we identify and investigate the phenomenon of Editing Overfit, where edited models assign disproportionately high probabilities to the edit target, hindering the generalization of new knowledge in complex scenarios. We attribute this issue to the current editing paradigm, which places excessive emphasis on the direct correspondence between the input prompt and the edit target for each edit sample. To further explore this issue, we introduce a new benchmark, EVOKE (EValuation of Editing Overfit in Knowledge Editing), along with fine-grained evaluation metrics. Through comprehensive experiments and analysis, we demonstrate that Editing…
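The core symptom described in the abstract, an edited model that puts too much probability on the edit target even when a different answer is required, can be illustrated with a toy check. This is a minimal sketch under stated assumptions, not the EVOKE metrics: the logits, the candidate answers, and the helper names `softmax` and `overfit_flag` are all hypothetical and chosen only for illustration.

```python
from math import exp

def softmax(logits):
    """Turn a dict of candidate -> logit into a probability distribution."""
    m = max(logits.values())
    exps = {tok: exp(v - m) for tok, v in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def overfit_flag(probs, edit_target, correct_answer):
    """Return True when the edit target outranks the correct answer."""
    return probs[edit_target] > probs[correct_answer]

# Hypothetical edit: "Microsoft was founded by" -> "Steve Jobs".
# Hypothetical multi-hop prompt: "Which university did the founder of
# Microsoft attend?" The correct post-edit answer is NOT the edit target.
logits = {"Steve Jobs": 5.0, "Reed College": 2.0, "Harvard University": 1.0}
probs = softmax(logits)
overfit = overfit_flag(probs, edit_target="Steve Jobs",
                       correct_answer="Reed College")
print(probs, overfit)
```

In this made-up case the flag is raised because the edit target dominates the distribution on a prompt whose answer should be a different entity.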

Peer Reviews

Decision · ICLR 2025 Spotlight

Reviewer 01 · Rating 8 · Confidence 4

Strengths

- The overfitting issue in parameter-modifying knowledge editing methods identified by the authors is an interesting finding.
- The proposed benchmark contributes to a more comprehensive analysis of the effectiveness of knowledge editing methods.
- The authors provide extensive experimental results, conducting a detailed analysis of previous parameter-modifying knowledge editing methods and testing various previously suggested mitigation techniques.
- The LTI method proposed by the authors is

Weaknesses

- The proposed LTI method relies on the model's in-context learning capability. This approach requires the unedited model to generate correct answers based solely on the context of the new knowledge, thereby providing an accurate representation and output distribution constraint for the editing process. This implies that for smaller models less proficient in in-context learning, or for models particularly rigid regarding certain facts, the constraints provided by the LTI method may negatively im
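The reviewer's description of an "output distribution constraint" can be pictured as comparing two next-token distributions: the one the unedited model produces when the new fact is given in context, and the one the edited model produces without that context. The sketch below is an assumption, not the paper's loss: the excerpt does not say which divergence LTI uses, so KL divergence is swapped in for illustration, and all probabilities are invented.

```python
from math import log

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) over the candidates in p; eps guards against log(0)."""
    return sum(
        pi * log((pi + eps) / (q.get(tok, 0.0) + eps))
        for tok, pi in p.items()
        if pi > 0
    )

# Hypothetical reference: unedited model with the new fact in its context.
in_context = {"Reed College": 0.80, "Steve Jobs": 0.10, "Harvard University": 0.10}

# Hypothetical edited models answering the same prompt without context.
overfit_edit = {"Reed College": 0.10, "Steve Jobs": 0.85, "Harvard University": 0.05}
aligned_edit = {"Reed College": 0.75, "Steve Jobs": 0.15, "Harvard University": 0.10}

gap_overfit = kl_divergence(in_context, overfit_edit)
gap_aligned = kl_divergence(in_context, aligned_edit)
print(gap_overfit, gap_aligned)
```

The reviewer's concern maps directly onto this picture: if the unedited model fails to use the in-context fact, `in_context` is itself a poor reference and the constraint pulls the edit in the wrong direction.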

Reviewer 02 · Rating 8 · Confidence 4

Strengths

- The paper clearly identifies Editing Overfit as a pervasive issue within existing LLM editing methods, providing evidence through extensive experimentation and analysis. The identification of this new key problem is one of the major contributions of this paper.
- Besides identifying the problem, the paper also introduces the EVOKE benchmark with fine-grained evaluation metrics, allowing for a more systematic and comprehensive evaluation of the Editing Overfit problem.
- The paper also careful

Weaknesses

- It seems that LTI is only applicable in cases where the edited knowledge can be explicitly represented as knowledge triples (s, r, o, o*). However, more complex or nuanced knowledge editing tasks may not fit into this structured format, potentially limiting LTI's applicability in real-world scenarios.
- The phenomenon of Editing Overfit and the proposed LTI approach are only tested on small-scale LLMs such as GPT-J and GPT-2. Extending the analysis to larger models, such as LLaMA-2/3 or other
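For readers unfamiliar with the (s, r, o, o*) notation the reviewer mentions, a minimal sketch of one way to hold such an edit request follows. The class name `EditTriple`, its field names, and the example fact are hypothetical; reading o as the original object and o* as the new object is an assumption based on common usage in the knowledge editing literature.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EditTriple:
    """One edit request: (subject, relation, old object, new object)."""
    subject: str      # s
    relation: str     # r, written here as a text template fragment
    old_object: str   # o, what the model currently believes (assumed reading)
    new_object: str   # o*, what the edit should install (assumed reading)

    def prompt(self) -> str:
        """Text the editor would condition on."""
        return f"{self.subject} {self.relation}"

    def edited_fact(self) -> str:
        """The full statement after the edit."""
        return f"{self.prompt()} {self.new_object}"

# Hypothetical example edit.
edit = EditTriple("Microsoft", "was founded by", "Bill Gates", "Steve Jobs")
print(edit.edited_fact())
```

Knowledge that does not decompose into a single subject, relation, and object has no natural place in a structure like this, which is the limitation the reviewer raises.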

Reviewer 03 · Rating 6 · Confidence 3

Strengths

1. The paper's exploration of the overfitting problem in knowledge editing is both highly interesting and valuable, and the design of a dedicated benchmark provides an effective way to investigate it in depth.
2. The work is highly comprehensive, identifying an unexplored issue, evaluating it within multi-hop reasoning, and proposing a plug-and-play solution based on observed phenomena.
3. The writing is very clear, and the experimental design is comprehensive and well-aligned with the motivat

Weaknesses

1. I believe Section 5 may not be essential to the overall paper. While it contributes some useful experimental insights with basic mitigation techniques, it does not appear closely aligned with the paper's primary contributions. Instead, it occupies space that could be better devoted to the LTI section, which I consider more significant.
2. I understand that this work focuses on the overfitting issue in model editing. However, to my knowledge, model editing itself performs poorly on multi-h

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Model-Driven Software Engineering Techniques