Disentangling Knowledge Representations for Large Language Model Editing

Mengqi Zhang; Zisheng Zhou; Xiaotian Ye; Qiang Liu; Zhaochun Ren; Zhumin Chen; Pengjie Ren

arXiv:2505.18774 · cs.CL · March 26, 2026

Open Access · 3 Reviews

TL;DR

This paper introduces DiKE, a novel method for editing large language models that effectively disentangles knowledge representations to preserve irrelevant facts while updating specific knowledge, improving precision and efficiency.

Contribution

DiKE is the first approach to explicitly disentangle subject representations for targeted knowledge editing, enhancing fine-grained irrelevant knowledge preservation in LLMs.

Findings

01 · DiKE significantly improves irrelevant knowledge preservation.

02 · It maintains competitive general editing performance.

03 · The approach is efficient and minimally invasive.

Abstract

Knowledge Editing has emerged as a promising solution for efficiently updating embedded knowledge in large language models (LLMs). While existing approaches demonstrate effectiveness in integrating new knowledge and preserving the original capabilities of LLMs, they fail to maintain fine-grained irrelevant knowledge, namely facts that share the same subject as edited knowledge but differ in relation and object. This challenge arises because subject representations inherently encode multiple attributes, causing the target and fine-grained irrelevant knowledge to become entangled in the representation space, and thus vulnerable to unintended alterations during editing. To address this, we propose DiKE, a novel approach that Disentangles Knowledge representations for LLM Editing (DiKE). DiKE consists of two key components: a Knowledge Representation Disentanglement (KRD) module that…
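The abstract's definition of fine-grained irrelevant knowledge (facts that share the edited fact's subject but differ in relation and object) can be made concrete with a toy filter. This is a minimal sketch, not the paper's code: the `Fact` tuple, the `fine_grained_irrelevant` helper, and the example facts are all hypothetical names and data introduced here for illustration only.

```python
from typing import List, NamedTuple


class Fact(NamedTuple):
    """A (subject, relation, object) triple. Hypothetical container, not from the paper."""
    subject: str
    relation: str
    obj: str


def fine_grained_irrelevant(edited: Fact, facts: List[Fact]) -> List[Fact]:
    """Return facts that share the edited fact's subject but differ in
    both relation and object, following the abstract's definition."""
    return [
        f for f in facts
        if f.subject == edited.subject
        and f.relation != edited.relation
        and f.obj != edited.obj
    ]


# Toy knowledge store (illustrative data only).
facts = [
    Fact("Eiffel Tower", "located in", "Paris"),
    Fact("Eiffel Tower", "named after", "Gustave Eiffel"),
    Fact("Eiffel Tower", "completed in", "1889"),
    Fact("Louvre", "located in", "Paris"),
]

# The fact being edited; the other Eiffel Tower facts should stay intact.
edited = Fact("Eiffel Tower", "located in", "Paris")
to_preserve = fine_grained_irrelevant(edited, facts)
```

In this toy setting, `to_preserve` holds the two remaining Eiffel Tower facts. Per the abstract, these are the facts most at risk during an edit, because they are encoded in the same subject representation as the target fact.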

Peer Reviews

Decision · ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

- This paper is clearly written overall.
- The experiments cover a range of benchmarks and baselines, and the ablation studies help clarify the role of each component of the method.
- Overall, the experimental results are good and demonstrate the effectiveness of the proposed method.

Weaknesses

- **Disentangler Dependence**: KRD/DiKE relies on the quality of the disentanglement module. Poor disentanglement (e.g., on out-of-domain facts) risks reduced edit efficacy or corruption. Incorporating more case studies would provide a more comprehensive understanding of DiKE.
- **Benchmark Gaps**: FINE-KED is restricted to explicit subject-relation-object facts, with similarity determined heuristically and validated by humans only to a limited extent. It lacks evaluation of open-domain/paraphrased knowledge generalization. While low pretraining-evaluation o

Reviewer 02 · Rating 6 · Confidence 3

Strengths

Please see my comments above.

Weaknesses

Please see my comments above.

Reviewer 03 · Rating 4 · Confidence 5

Strengths

* This paper highlights the problem that current model editing methods fail to preserve fine-grained irrelevant knowledge.
* DiKE’s approach of disentangling fine-grained relevant and irrelevant knowledge before performing editing is novel and interesting.
* The paper demonstrates the effectiveness of DiKE through experiments, and notably, DiKE achieves impressive results in multi-hop editing, indicating its potential for multi-hop knowledge editing.

Weaknesses

* DiKE seems to focus too heavily on locality. However, the efficacy and generalization of editing methods are also crucial aspects.
* DiKE appears to have limited scalability.
* DiKE seems applicable only to factual knowledge that contains a subject, which may restrict its applicability.

Taxonomy

Topics · Topic Modeling