WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models

Peng Wang; Zexi Li; Ningyu Zhang; Ziwen Xu; Yunzhi Yao; Yong Jiang; Pengjun Xie; Fei Huang; Huajun Chen

arXiv:2405.14768 · cs.CL · December 20, 2024 · 3 cites

TL;DR

WISE introduces a dual memory scheme with a router and knowledge sharding to improve lifelong model editing of large language models, addressing reliability, generalization, and locality issues.

Contribution

The paper proposes WISE, a novel dual parametric memory system with a routing mechanism and knowledge sharding for effective lifelong model editing.
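As a rough illustration of what routing between two parametric memories could look like, here is a minimal sketch. This page does not state the routing rule, so the sketch assumes one: the side memory starts as a copy of the main memory, and a query is sent to the side memory only when the two memories disagree on it by more than a threshold (L2 norm of the output difference). The names `route`, `w_main`, `w_side`, and `threshold` are hypothetical, the weights are toy values, and knowledge sharding is not covered.

```python
import math


def matvec(weights, x):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def route(x, w_main, w_side, threshold):
    """Return (output, used_side) for an input activation x.

    Assumed rule (not taken from the page): use the side memory when the
    two memories disagree on x by more than `threshold` in L2 norm.
    """
    out_main = matvec(w_main, x)
    out_side = matvec(w_side, x)
    score = math.sqrt(sum((s - m) ** 2 for s, m in zip(out_side, out_main)))
    if score > threshold:
        return out_side, True
    return out_main, False


# Main memory: stands in for pretrained weights, left untouched by edits.
w_main = [[1.0, 0.0], [0.0, 1.0]]

# Side memory: starts as a copy; a toy "edit" changes its response
# to the first input dimension only.
w_side = [row[:] for row in w_main]
w_side[0][0] = 3.0

edited_query = [1.0, 0.0]     # activates the edited direction
unrelated_query = [0.0, 1.0]  # does not touch the edited direction

out_edit, side_edit = route(edited_query, w_main, w_side, threshold=0.5)
out_other, side_other = route(unrelated_query, w_main, w_side, threshold=0.5)

print(side_edit, out_edit)    # edited query is served by the side memory
print(side_other, out_other)  # unrelated query falls back to the main memory
```

In this toy setup the unrelated query gets exactly the main-memory output, which is the locality property the summary refers to; how the threshold is chosen and how edits are written into the side memory are left open here.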

Findings

1. Outperforms previous editing methods across multiple tasks.

2. Effectively overcomes the impossible triangle in lifelong editing.

3. Works across various large language model architectures.

Abstract

Large language models (LLMs) need knowledge updates to meet the ever-growing world facts and correct the hallucinated responses, facilitating the methods of lifelong model editing. Where the updated knowledge resides in memories is a fundamental question for model editing. In this paper, we find that editing either long-term memory (direct model parameters) or working memory (non-parametric knowledge of neural network activations/representations by retrieval) will result in an impossible triangle -- reliability, generalization, and locality cannot be realized together in the lifelong editing settings. For long-term memory, directly editing the parameters will cause conflicts with irrelevant pretrained knowledge or previous edits (poor reliability and locality). For working memory, retrieval-based activations can hardly make the model understand the edits and generalize (poor…

Code & Models

Repositories

zjunlp/easyedit (PyTorch, official)

Videos

WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models · slideslive

Taxonomy

Topics: Model-Driven Software Engineering Techniques · Topic Modeling · Natural Language Processing Techniques

Methods: Attention Is All You Need · Cosine Annealing · Attention Dropout · Linear Layer · Multi-Head Attention · Residual Connection · Weight Decay · Linear Warmup With Cosine Annealing · Byte Pair Encoding