In-Context Editing: Learning Knowledge from Self-Induced Distributions

Siyuan Qi; Bangcheng Yang; Kailin Jiang; Xiaobo Wang; Jiaqi Li; Yifan Zhong; Yaodong Yang; Zilong Zheng

arXiv:2406.11194·cs.CL·April 1, 2025

TL;DR

This paper presents Consistent In-Context Editing (ICE), a novel method enabling language models to efficiently incorporate new knowledge through in-context learning, improving robustness and avoiding overfitting without extensive retraining.

Contribution

ICE introduces a simple optimization framework that aligns model output distributions with and without additional context, enhancing knowledge editing capabilities.
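The alignment idea can be illustrated with a toy example. The sketch below is not the paper's implementation: it assumes a three-token vocabulary, treats the with-context distribution as a fixed target, and uses KL divergence with plain gradient descent on raw logits. All names (`softmax`, `kl_divergence`, `align_step`) and all numbers are hypothetical choices made for illustration.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def align_step(logits_no_ctx, target_probs, lr=0.5):
    """One gradient-descent step on the context-free logits.

    The gradient of KL(target || softmax(z)) with respect to z is
    softmax(z) - target, so each step moves the context-free
    distribution toward the with-context one.
    """
    q = softmax(logits_no_ctx)
    return [z - lr * (qi - pi) for z, qi, pi in zip(logits_no_ctx, q, target_probs)]


# Assumed toy setup: with the new fact in context, the model favors token 1;
# without context it still favors token 0.
p_with_ctx = softmax([0.2, 3.0, 0.1])
logits = [2.0, 0.5, 0.1]

kl_before = kl_divergence(p_with_ctx, softmax(logits))
for _ in range(200):
    logits = align_step(logits, p_with_ctx)
kl_after = kl_divergence(p_with_ctx, softmax(logits))
p_after = softmax(logits)
```

The target here is a full distribution rather than a one-hot vector, so probability mass on the other tokens is matched as well instead of being pushed to zero. In a real model the update would go through the model parameters rather than the logits directly.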

Findings

01 · ICE improves accuracy of knowledge editing.

02 · ICE maintains linguistic quality and model integrity.

03 · ICE demonstrates robustness across various editing scenarios.

Abstract

In scenarios where language models must incorporate new information efficiently without extensive retraining, traditional fine-tuning methods are prone to overfitting, degraded generalization, and unnatural language generation. To address these limitations, we introduce Consistent In-Context Editing (ICE), a novel approach leveraging the model's in-context learning capability to optimize toward a contextual distribution rather than a one-hot target. ICE introduces a simple yet effective optimization framework for the model to internalize new knowledge by aligning its output distributions with and without additional context. This method enhances the robustness and effectiveness of gradient-based tuning methods, preventing overfitting and preserving the model's integrity. We analyze ICE across four critical aspects of knowledge editing: accuracy, locality, generalization, and linguistic…

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01 · Rating 8 · Confidence 4

Strengths

- The paper is clear, well-motivated and the idea is novel as far as I know.
- Compared to other baselines, this method is the only one capable of effectively editing knowledge continually.

Weaknesses

- The pipeline is quite heavy, relying on sampling at every optimization step and GPT-4 for augmented contexts.

Reviewer 02 · Rating 6 · Confidence 5

Strengths

1. The method is novel for knowledge editing by identifying an issue with prior approaches for targeted knowledge editing. While there has been prior work on fine-tuning and naive work on in-context (prompt-based) knowledge editing, the combination of distilling the in-context editing directly into the parameters has not been done.
2. The empirical results, while not perfect on all metrics and datasets, show promise across the baseline methods presented and on the standard metrics and perplexit

Weaknesses

1. The method is similar to knowledge/context distillation or gisting, and so a connection should be drawn there. Still, applying this method appears novel for knowledge editing. However the lack of references to KD/gisting makes it hard to place how related (or not) this idea is to that line of work.
   - [Snell et al., 2022](https://arxiv.org/abs/2209.15189) - Context Distillation
   - [Mu et al., 2023](https://arxiv.org/abs/2304.08467) - Gisting
2. The paper advocates for conditioning on “context” t

Reviewer 03 · Rating 8 · Confidence 3

Strengths

The paper proposes an interesting method that is likely to be useful for future applications. The paper does a good job of demonstrating the usefulness of the method.

Weaknesses

The paper studies only one base model, so it is not clear how the method generalizes to other models. In particular, I suspect that the size of the model and its base in-context capabilities might play an important role in the success of the method.

Code & Models

Repositories

Datasets

Yofuria/ICE
dataset· 7 dl

Videos

In-Context Editing: Learning Knowledge from Self-Induced Distributions· slideslive

Taxonomy

Topics: Machine Learning and Algorithms · Data Stream Mining Techniques