Perturbation-Restrained Sequential Model Editing

Jun-Yu Ma; Hong Wang; Hao-Xiang Xu; Zhen-Hua Ling; Jia-Chen Gu

arXiv:2405.16821 · cs.CL · March 4, 2025

TL;DR

This paper introduces PRUNE, a framework that applies condition number restraints during sequential model editing to preserve the general abilities of large language models while updating specific knowledge.

Contribution

The paper proposes a novel method, PRUNE, that uses condition number restraints to balance model editing effectiveness and preservation of general abilities.

Findings

1. PRUNE effectively maintains model general abilities during sequential edits.
2. Experimental results show PRUNE outperforms existing methods in preserving capabilities.
3. PRUNE achieves this without sacrificing editing performance across multiple tasks.

Abstract

Model editing is an emerging field that focuses on updating the knowledge embedded within large language models (LLMs) without extensive retraining. However, current model editing methods significantly compromise the general abilities of LLMs as the number of edits increases, and this trade-off poses a substantial challenge to the continual learning of LLMs. In this paper, we first theoretically analyze that the factor affecting the general abilities in sequential model editing lies in the condition number of the edited matrix. The condition number of a matrix represents its numerical sensitivity, and therefore can be used to indicate the extent to which the original knowledge associations stored in LLMs are perturbed after editing. Subsequently, statistical findings demonstrate that the value of this factor becomes larger as the number of edits increases, thereby exacerbating the…
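
The abstract's point that the condition number reflects a matrix's numerical sensitivity can be made concrete with a small experiment. The sketch below is an illustration only, not PRUNE and not the authors' code: it assumes random rank-one updates as a stand-in for sequential edits, and the helper names `condition_number` and `sequential_rank_one_edits` are hypothetical.

```python
import numpy as np


def condition_number(matrix):
    """2-norm condition number: largest singular value over the smallest."""
    singular_values = np.linalg.svd(matrix, compute_uv=False)  # sorted descending
    return singular_values[0] / singular_values[-1]


def sequential_rank_one_edits(weight, num_edits, scale, seed=0):
    """Apply random rank-one updates one at a time (an assumed stand-in for
    sequential edits) and record the condition number after each step."""
    rng = np.random.default_rng(seed)
    edited = weight.copy()  # leave the caller's matrix untouched
    history = [condition_number(edited)]
    for _ in range(num_edits):
        u = rng.standard_normal(weight.shape[0])
        v = rng.standard_normal(weight.shape[1])
        edited = edited + scale * np.outer(u, v)
        history.append(condition_number(edited))
    return edited, history


if __name__ == "__main__":
    original = np.eye(8)  # perfectly conditioned starting point
    _, history = sequential_rank_one_edits(original, num_edits=20, scale=0.5)
    for step, value in enumerate(history):
        print(f"edits={step:2d}  condition number={value:.3f}")
```

Starting from the identity makes the baseline condition number exactly 1, so any departure from 1 in the printed history comes from the accumulated updates. The trend in this toy setting depends on the assumed `scale` and random seed, and it says nothing about the magnitudes the paper reports for real LLM weights.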

Peer Reviews

Decision · ICLR 2025 Poster

Reviewer 01 · Rating 8 · Confidence 2

Strengths

* Sequential editing is indeed a crucial research issue in knowledge editing, and this paper examines it from a relatively comprehensive perspective while proposing an excellent solution.
* The literature review is thorough.
* The evaluation scope is extensive, based on three representative LLMs, including GPT-2 XL, LLaMA-2, and LLaMA-3. It also includes four representative downstream tasks (reasoning, summarization, open-domain question answering, and natural language inference) to broadly demo

Weaknesses

1. It would be helpful to report the results on the "Probability" metric, which is described in both [1] and [2]. I would like to know if applying the proposed framework to impose constraints on sequential editing affects the generalization and ripple effects of the edits themselves.
2. It would be more comprehensive if we could see some results on larger-scale models, such as at least those with 13B parameters or more.
3. Methods like MEMIT support batch editing, so it would be worthwhile to

Reviewer 02 · Rating 6 · Confidence 4

Strengths

1. Knowledge editing is an important topic, and addressing sequential editing is a challenging yet meaningful direction.
2. This paper is well motivated, and the design of PRUNE is clearly explained.
3. This paper proposes a plug-and-play method, which demonstrates significant improvements in sequential editing across various models.

Weaknesses

1. Some important details need further clarification. For example, in the editing setup, the key $k$ in the key-value pair $(k, v)$ typically does not change, and usually only $W$ and $v$ are modified. It is unclear why the paper focuses on analyzing the changes in $k_i$ instead of $v$ or $W$.
2. The PRUNE method imposes constraints on parameters (as described in Equations (3), (4), and (5)), which might potentially reduce the model’s ability to retain other capabilities or preserve historica

Reviewer 03 · Rating 6 · Confidence 4

Strengths

The paper introduces a theoretically derived upper bound for weight modification. By defining this threshold, the authors provide a quantitative limit that, when exceeded, triggers degradation in the model’s original functions and leads to edit forgetting. This contribution deepens understanding of the trade-offs involved in weight modification and offers a practical guideline for preserving model performance. The authors demonstrate that PRUNE effectively mitigates the negative impacts of sequ

Weaknesses

The study focuses solely on the Llama-2 (7B) model, with a relatively small set of samples for evaluation. This is a concern because editing in large numbers might exacerbate the impact of the shift that needs to be controlled. To broaden the scope, the authors could consider using a smaller model from the GPT series or a single editing approach, allowing for an expanded sample size under computational constraints. This could strengthen the study by providing more robust evidence for PRUNE’s effectiv

Code & Models

Repositories

mjy1111/prune (PyTorch, official)

Taxonomy

Topics: Model-Driven Software Engineering Techniques · Simulation Techniques and Applications · Reinforcement Learning in Robotics