Resolving Lexical Bias in Model Editing

Hammad Rizwan; Domenic Rosati; Ga Wu; Hassan Sajjad

arXiv:2408.10411 · cs.CL · October 10, 2025

TL;DR

This paper introduces PENME, a model editing method that learns a disentangled representation space to improve the precision and efficiency of editing large language models, counteracting the lexical bias of adapter-based editors.

Contribution

The paper proposes a new approach to model editing that disentangles representations for better localization and robustness, outperforming previous methods in accuracy and efficiency.

Findings

01. Achieves state-of-the-art editing performance

02. More computationally efficient during inference

03. Effective across different model architectures

Abstract

Model editing aims to modify the outputs of large language models after they are trained. Previous approaches have often involved direct alterations to model weights, which can result in model degradation. Recent techniques avoid making modifications to the model's weights by using an adapter that applies edits to the model when triggered by semantic similarity in the representation space. We demonstrate that current adapter methods are critically vulnerable to strong lexical biases, leading to issues such as applying edits to irrelevant prompts with overlapping words. This paper presents a principled approach to learning a disentangled representation space that facilitates precise localization of edits by maintaining distance between irrelevant prompts while preserving proximity among paraphrases. In our empirical study, we show that our method (Projector Editor Networks for Model…
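The two ideas in the abstract, a representation space where paraphrases sit close and irrelevant prompts sit far apart, and an adapter that fires only when a query is similar enough to a stored edit, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Euclidean distance, the margin-based contrastive loss, the codebook layout, and the threshold value are all assumptions chosen for illustration, and the vectors stand in for already-projected prompt representations.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(anchor, other, is_paraphrase, margin=1.0):
    """Pairwise contrastive loss (illustrative, not the paper's exact objective).

    Paraphrase pairs are penalized by their squared distance, so training
    pulls them together. Irrelevant pairs are penalized only while they are
    closer than `margin`, so training pushes them apart up to that margin.
    """
    d = euclidean(anchor, other)
    if is_paraphrase:
        return d ** 2
    return max(0.0, margin - d) ** 2

def lookup_edit(query, codebook, threshold):
    """Return the stored edit whose key is nearest to `query`.

    `codebook` is a hypothetical list of (key_vector, edit_output) pairs.
    If no key lies within `threshold`, return None, meaning the unedited
    model output should be used.
    """
    if not codebook:
        return None
    key, output = min(codebook, key=lambda entry: euclidean(query, entry[0]))
    return output if euclidean(query, key) <= threshold else None

# Hypothetical projected representations of two stored edits.
codebook = [([0.0, 0.0], "edit A"), ([10.0, 0.0], "edit B")]
near_paraphrase = lookup_edit([0.1, 0.0], codebook, threshold=0.5)
unrelated_prompt = lookup_edit([5.0, 0.0], codebook, threshold=0.5)
```

In this sketch a query near a stored key retrieves that edit, while a query far from every key falls through to the original model. Lexical bias corresponds to the case where an irrelevant prompt with overlapping words lands inside the threshold; the margin term in the loss is the part that is meant to prevent that.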

Code & Models

Repositories

hammadrizwan/PENME (PyTorch, official)

Videos

Resolving Lexical Bias in Model Editing · SlidesLive

Taxonomy

Topics: Natural Language Processing Techniques · Text Readability and Simplification

Methods: Adapter · Balanced Selection · Contrastive Learning