Disentangling Knowledge Representations for Large Language Model Editing
Mengqi Zhang, Zisheng Zhou, Xiaotian Ye, Qiang Liu, Zhaochun Ren, Zhumin Chen, Pengjie Ren

TL;DR
This paper introduces DiKE, a novel method for editing large language models that effectively disentangles knowledge representations to preserve irrelevant facts while updating specific knowledge, improving precision and efficiency.
Contribution
DiKE is the first approach to explicitly disentangle subject representations for targeted knowledge editing, enhancing fine-grained irrelevant knowledge preservation in LLMs.
Findings
DiKE significantly improves irrelevant knowledge preservation.
It maintains competitive general editing performance.
The approach is efficient and minimally invasive.
Abstract
Knowledge Editing has emerged as a promising solution for efficiently updating embedded knowledge in large language models (LLMs). While existing approaches demonstrate effectiveness in integrating new knowledge and preserving the original capabilities of LLMs, they fail to maintain fine-grained irrelevant knowledge, namely facts that share the same subject as edited knowledge but differ in relation and object. This challenge arises because subject representations inherently encode multiple attributes, causing the target and fine-grained irrelevant knowledge to become entangled in the representation space, and thus vulnerable to unintended alterations during editing. To address this, we propose DiKE, a novel approach that Disentangles Knowledge representations for LLM Editing (DiKE). DiKE consists of two key components: a Knowledge Representation Disentanglement (KRD) module that…
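To make the abstract's definition of fine-grained irrelevant knowledge concrete, the following is a minimal sketch, not the paper's method or the FINE-KED benchmark code. It assumes facts are stored as (subject, relation, object) triples; the `Fact` type, the `fine_grained_irrelevant` helper, and the toy facts are hypothetical names introduced only for illustration. Given a counterfactual edit, it selects the facts that share the edited subject but differ in relation and object, i.e. the facts an editor should leave untouched.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """A (subject, relation, object) triple; a hypothetical container for this sketch."""
    subject: str
    relation: str
    obj: str


def fine_grained_irrelevant(edit: Fact, facts: list[Fact]) -> list[Fact]:
    """Return facts that share the edit's subject but differ in relation and object."""
    return [
        f for f in facts
        if f.subject == edit.subject
        and f.relation != edit.relation
        and f.obj != edit.obj
    ]


# Toy knowledge base (illustrative only).
facts = [
    Fact("Marie Curie", "born in", "Warsaw"),
    Fact("Marie Curie", "field", "physics"),
    Fact("Marie Curie", "spouse", "Pierre Curie"),
    Fact("Pierre Curie", "born in", "Paris"),
]

# A counterfactual edit of the kind used in knowledge-editing evaluations.
edit = Fact("Marie Curie", "born in", "Paris")

preserved = fine_grained_irrelevant(edit, facts)
for f in preserved:
    print(f)
```

In this toy setup, the two remaining facts about the edited subject (field and spouse) are the ones whose answers should be unchanged after the edit; the fact about a different subject is ordinary locality rather than the fine-grained case the abstract describes.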
Peer Reviews
Decision·ICLR 2026 Poster
- The paper is clearly written overall.
- The experiments cover a range of benchmarks and baselines, and the ablation studies help explain each component of the method.
- The experimental results are good overall, which demonstrates the effectiveness of the proposed method.
- **Disentangler Dependence**: KRD/DiKE relies on the quality of the disentanglement module. Poor disentanglement (e.g., on out-of-domain facts) risks reduced edit efficacy or corruption. Incorporating more case studies would provide a more comprehensive understanding of DiKE.
- **Benchmark Gaps**: FINE-KED is restricted to explicit subject-relation-object facts with heuristic and limited human-validated similarity. It lacks evaluation of generalization to open-domain or paraphrased knowledge. While low pretraining-evaluation o
Please see my comments above.
* This paper highlights the problem that current model editing methods fail to preserve fine-grained irrelevant knowledge.
* DiKE’s approach of disentangling fine-grained relevant and irrelevant knowledge before performing editing is novel and interesting.
* The paper demonstrates the effectiveness of DiKE through experiments, and notably, DiKE achieves impressive results in multi-hop editing, indicating its potential for multi-hop knowledge editing.
* DiKE seems to focus too heavily on locality. However, the efficacy and generalization of editing methods are also crucial aspects.
* DiKE appears to have limited scalability.
* DiKE seems applicable only to factual knowledge that contains a subject, which may restrict its applicability.
Taxonomy
Topics: Topic Modeling
