TL;DR
HiEdit introduces a hierarchical reinforcement learning approach to selectively and efficiently edit large language models by identifying and updating only the most relevant layers for each knowledge correction.
Contribution
The paper proposes HiEdit, a novel framework that adaptively selects model layers for editing, improving precision and reducing side effects compared to existing static methods.
Findings
HiEdit outperforms RLEdit by an average of 8.48% in editing performance.
HiEdit achieves effective knowledge editing while perturbing only half of the model layers.
Instance-aware layer selection improves adaptability when integrating new knowledge and reduces side effects on unrelated and previously edited knowledge.
Abstract
Lifelong model editing (LME) aims to sequentially rectify outdated or inaccurate knowledge in deployed LLMs while minimizing side effects on unrelated inputs. However, existing approaches typically apply parameter perturbations to a static and dense set of LLM layers for all editing instances. This practice is counter-intuitive, as we hypothesize that different pieces of knowledge are stored in distinct layers of the model. Neglecting this layer-wise specificity can impede adaptability in integrating new knowledge and result in catastrophic forgetting for both general and previously edited knowledge. To address this, we propose HiEdit, a hierarchical reinforcement learning framework that adaptively identifies the most knowledge-relevant layers for each editing instance. By enabling dynamic, instance-aware layer selection and incorporating an intrinsic reward for sparsity, HiEdit…
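The idea of instance-aware layer selection with a sparsity reward can be illustrated with a toy sketch. This is not the paper's implementation: the per-layer Bernoulli policy, the function names (`sample_layer_mask`, `sparsity_reward`, `total_reward`), the reward form, and the `weight` value are all assumptions made for illustration, and the hierarchical part of the framework is not modeled.

```python
import random


def sample_layer_mask(layer_probs, rng):
    """Sample a binary mask over layers; 1 means 'edit this layer' for this instance.

    layer_probs is an assumed per-layer selection probability, standing in for
    whatever a learned policy would output for one editing instance.
    """
    return [1 if rng.random() < p else 0 for p in layer_probs]


def sparsity_reward(mask, weight=0.1):
    """Hypothetical intrinsic reward: grows as fewer layers are selected."""
    return weight * (1.0 - sum(mask) / len(mask))


def total_reward(edit_reward, mask, weight=0.1):
    """Add the sparsity bonus to an extrinsic editing reward (assumed scalar)."""
    return edit_reward + sparsity_reward(mask, weight)


rng = random.Random(0)
layer_probs = [0.9, 0.1, 0.8, 0.2]  # assumed values for a 4-layer toy model
mask = sample_layer_mask(layer_probs, rng)
reward = total_reward(edit_reward=1.0, mask=mask)
print(mask, reward)
```

In this toy setup, a mask that selects fewer layers earns a larger bonus, so a policy trained on `total_reward` would be nudged toward sparse edits as long as the editing reward does not suffer.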
