LEMoE: Advanced Mixture of Experts Adaptor for Lifelong Model Editing of Large Language Models
Renzhi Wang, Piji Li

TL;DR
LEMoE introduces a novel Mixture of Experts adaptor for lifelong model editing of large language models, effectively addressing issues like catastrophic forgetting and routing inconsistency, and outperforming previous techniques in both lifelong and batch editing tasks.
Contribution
The paper proposes a new MoE adaptor with KV anchor routing and clustering-based editing order planning specifically designed for lifelong model editing.
Findings
LEMoE outperforms previous model editing methods in lifelong editing tasks.
The proposed approach maintains high performance in batch editing scenarios.
Enhanced routing consistency reduces catastrophic forgetting.
Abstract
Large language models (LLMs) require continual knowledge updates to stay abreast of ever-changing world facts, prompting the formulation of the lifelong model editing task. While recent years have witnessed the development of various techniques for single and batch editing, these methods either fail to apply or perform sub-optimally when faced with lifelong editing. In this paper, we introduce LEMoE, an advanced Mixture of Experts (MoE) adaptor for lifelong model editing. We first analyze the factors influencing the effectiveness of a conventional MoE adaptor in lifelong editing, including catastrophic forgetting, inconsistent routing, and order sensitivity. Based on these insights, we propose a tailored module insertion method to achieve lifelong editing, incorporating a novel KV anchor routing to enhance routing consistency between the training and inference stages, along with a concise yet…
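As a rough illustration of the general idea behind key-based routing in an MoE adaptor, the sketch below routes an input to the expert whose key vector best matches a query embedding, and uses the same routing function at training and inference time. It is a toy under stated assumptions, not the paper's implementation: the names `route` and `moe_adaptor`, the dot-product similarity, the top-k renormalization, and the toy experts are all hypothetical choices made for this example.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def route(query, expert_keys, top_k=1):
    """Return (expert_index, weight) pairs for the top_k experts.

    Assumption: each expert owns one fixed key vector, and similarity is a
    plain dot product between the query embedding and that key.
    """
    probs = softmax([dot(query, key) for key in expert_keys])
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in ranked)
    return [(i, probs[i] / norm) for i in ranked]

def moe_adaptor(hidden, query, expert_keys, experts, top_k=1):
    """Add the weighted outputs of the selected experts to the hidden state."""
    out = list(hidden)
    for i, weight in route(query, expert_keys, top_k):
        delta = experts[i](hidden)
        out = [o + weight * d for o, d in zip(out, delta)]
    return out

# Hypothetical toy setup: two experts with orthogonal keys.
expert_keys = [[1.0, 0.0], [0.0, 1.0]]
experts = [
    lambda h: [1.0 for _ in h],   # expert 0 shifts every dimension up
    lambda h: [-1.0 for _ in h],  # expert 1 shifts every dimension down
]
hidden = [0.5, 0.5]
result = moe_adaptor(hidden, [2.0, 0.0], expert_keys, experts)
```

Because the routing decision depends only on the query and the stored keys, the same input is sent to the same expert regardless of whether the model is being edited or queried, which is the kind of consistency the abstract refers to.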
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
Methods: Mixture of Experts
