Keys to Robust Edits: from Theoretical Insights to Practical Advances

Jianhao Yan; Futing Wang; Yun Luo; Yafu Li; Yue Zhang

arXiv:2410.09338·cs.CL·May 23, 2025

TL;DR

This paper introduces the Robust Edit Pathway (REP), a novel module that enhances large language model editing by balancing robustness and specificity, leading to significant improvements in knowledge editing success rates.

Contribution

The paper presents REP, a plug-and-play module that disentangles editing keys from native representations and uses contrastive learning to improve robustness and specificity in model editing.
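The contrastive objective can be illustrated with a generic InfoNCE-style loss over key vectors. This is a minimal sketch under stated assumptions, not the paper's training code: the function names, the temperature value, and the toy 2-D keys are all hypothetical, and the summary above does not say which contrastive loss REP uses. The idea shown is only that keys for rephrasings of the edited fact are pulled toward the edit key, while keys for unrelated facts are pushed away.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: small when the anchor is much closer to the
    positive key than to any of the negative keys."""
    pos_logit = cosine(anchor, positive) / temperature
    logits = [pos_logit] + [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - pos_logit


# Hypothetical toy keys (assumptions for illustration only).
edit_key = [1.0, 0.0]                       # key tied to the edited fact
paraphrase_key = [0.9, 0.1]                 # same fact, reworded prompt
unrelated_keys = [[0.0, 1.0], [-1.0, 0.2]]  # keys of other facts

print("loss, paraphrase as positive:",
      info_nce(edit_key, paraphrase_key, unrelated_keys))
print("loss, unrelated key as positive:",
      info_nce(edit_key, unrelated_keys[0], [paraphrase_key, unrelated_keys[1]]))
```

Minimizing a loss of this shape rewards keys that stay close under rewording (robustness) and far from other facts (specificity), which is the balance the contribution statement describes.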

Findings

01. REP improves the editing success rate on robustness tests by up to 66.4%.

02. REP maintains the editing success rate while enhancing robustness.

03. Extensive experiments validate REP's effectiveness across models and datasets.

Abstract

Large language models (LLMs) struggle with maintaining accurate knowledge due to conflicting/outdated parametric memories. While locate-and-edit methods address this, their reliance on models' internal representations leads to robustness failures in long-context reasoning and paraphrased queries. We identify a fundamental limitation of locate-and-edit methods: existing semantic keys (for memory localization) cannot simultaneously satisfy robustness (context-invariant activation) and specificity (precise knowledge discrimination). Through theoretical error-bound analysis, we establish formal criteria for effective editing. Our solution introduces Robust Edit Pathway (REP), a plug-and-play module that: (1) disentangles editing keys from native model representations; (2) dynamically adjusts keys via contrastive learning to achieve robustness-specificity balance. Extensive…
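The robustness-specificity tension described in the abstract can be made concrete with a toy key-matching experiment. The sketch below is an assumption-laden illustration, not the paper's error-bound analysis: the thresholded cosine match, the function names, and the 3-D key vectors are all invented for this example. It shows a case where a key for a neighboring fact sits closer to the edit key than a key for a long-context version of the edited fact, so no single threshold yields both perfect robustness and perfect specificity.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def edit_fires(query_key, edit_key, threshold):
    """Toy rule: the edit is triggered when the query key is close enough."""
    return cosine(query_key, edit_key) >= threshold


def evaluate(threshold, edit_key, same_fact_keys, other_fact_keys):
    """Return (robustness, specificity) for one threshold.

    robustness:  fraction of same-fact queries that trigger the edit.
    specificity: fraction of other-fact queries that do NOT trigger it.
    """
    hits = sum(edit_fires(k, edit_key, threshold) for k in same_fact_keys)
    rejections = sum(not edit_fires(k, edit_key, threshold) for k in other_fact_keys)
    return hits / len(same_fact_keys), rejections / len(other_fact_keys)


# Hypothetical keys (assumptions for illustration only).
EDIT_KEY = [1.0, 0.0, 0.0]
SAME_FACT_KEYS = [
    [0.95, 0.1, 0.0],  # paraphrased query: key barely moves
    [0.6, 0.7, 0.2],   # long-context query: key drifts substantially
]
OTHER_FACT_KEYS = [
    [0.8, 0.0, 0.6],   # neighboring fact: key stays fairly close
    [0.0, 1.0, 0.0],   # unrelated fact: key is orthogonal
]

for threshold in (0.5, 0.7, 0.9):
    robustness, specificity = evaluate(
        threshold, EDIT_KEY, SAME_FACT_KEYS, OTHER_FACT_KEYS
    )
    print(f"threshold={threshold}: robustness={robustness}, specificity={specificity}")
```

In this toy setup, a strict threshold rejects the long-context query, while a loose one also accepts the neighboring fact. Moving the keys themselves, rather than tuning a threshold on fixed keys, is the kind of remedy the abstract attributes to REP.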

Peer Reviews

Decision·ICLR 2025 Conference Withdrawn Submission

Reviewer 01·Rating 6·Confidence 3

Strengths

1. The proposed REP framework uses a project-and-gate mechanism and separates editing pathways, providing a new approach to the knowledge editing task.
2. The paper combines theoretical derivation and empirical evidence.
3. The paper reports experiments and shows that the proposed REP works well on ROME, a representative locate-then-edit method.

Weaknesses

1. The experiments mainly use Llama-2-7B and Mistral-7B. More LLMs could be included and tested.
2. The experiments use only the CounterFact dataset. We expect to see results on real-world knowledge editing, because knowledge editing in the real world tends to be more diverse and complex.
3. What are the failure cases of REP? These should include failure cases of both editing and robustness tests.
4. It seems the experiment section only reports results with ROME and ignores MEMIT.

Reviewer 02·Rating 5·Confidence 3

Strengths

Knowledge editing is important for improving the quality of LLMs.
Related work seems well covered and reviewed.
The proposed solution sheds light on key-value approaches.
The paper is largely written, or heavily edited, by an LLM.

Weaknesses

The nature of the key-value data model would require an assessment of the conceptual modelling issues involved in knowledge editing.
A lot of knowledge is not stored in individual triples, but in subgraphs.
English issues and instability of terminology/definitions.
Few, poor examples do not help the reader, e.g., the triple (USA, president-of, Biden) is counterintuitive unless we interpret "president-of" as "has-president".

Reviewer 03·Rating 3·Confidence 4

Strengths

S1. The authors provided the code for their approach. However, they could have written a README to make the content easier to understand. Besides, the code is not anonymized, with paths containing the programmer's name (hardcoded paths are not a good sign of reproducibility).
S2. The subject is well-motivated at the beginning.
S3. In several places, the authors try to link the practical and the theoretical side of their work.

Weaknesses

W1. The authors need to improve the presentation and clarity of the paper. There are many typos in the paper (see below), and some are in the theoretical results, which is concerning. Sections 3 and 4 take a lot of work to follow. For Section 3, the authors copy-pasted the content of ROME. However, it is tough to understand without the context of the ROME paper. Section 4 is a succession of Lemmas and corollaries barely connected. The authors need to explain the intuition better. I provide addit

Taxonomy

Topics·Digital Humanities and Scholarship