Model Editing for New Document Integration in Generative Information Retrieval

Zhen Zhang; Zihan Wang; Xinyu Ma; Shuaiqiang Wang; Dawei Yin; Xin Xin; Pengjie Ren; Maarten de Rijke; Zhaochun Ren

arXiv:2603.02773 · cs.IR · May 12, 2026

TL;DR

This paper introduces DOME, a model editing technique that lets generative retrieval models incorporate new documents with less training time while maintaining retrieval accuracy.

Contribution

DOME adapts generative retrieval models to unseen documents through targeted parameter edits rather than incremental training, improving scalability and efficiency.

Findings

01. DOME significantly improves retrieval performance on new documents.

02. DOME reduces training time by approximately 40% compared to incremental training.

03. Experiments on NQ and MS MARCO benchmarks validate DOME's effectiveness.

Abstract

Generative retrieval (GR) reformulates the Information Retrieval (IR) task as the generation of document identifiers (docIDs). Despite its promise, existing GR models generalize poorly to newly added documents, often failing to generate the correct docIDs. While incremental training offers a straightforward remedy, it is computationally expensive, resource-intensive, and prone to catastrophic forgetting, which limits the scalability and practicality of GR. In this paper, we identify the core bottleneck as the decoder's ability to map hidden states to the correct docIDs of newly added documents. Model editing, which enables targeted parameter modifications for docID mapping, represents a promising solution. However, applying model editing to current GR models is not trivial: it is severely hindered by indistinguishable edit vectors across queries, due to the high overlap…
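
To make the idea of a "targeted parameter modification" concrete, the following is a minimal sketch of a generic rank-one edit to a single linear layer. It is not DOME itself, whose details are not given in the abstract above. The names `rank_one_edit`, `W`, `k`, and `v_target`, the toy dimensions, and the choice of a plain least-change rank-one update are all assumptions made for illustration only.

```python
import numpy as np

def rank_one_edit(W, k, v_target):
    """Return an edited copy of W with W_edited @ k == v_target.

    Generic rank-one update (an illustrative assumption, not the
    paper's method): only the direction of k in input space changes.
    """
    residual = v_target - W @ k            # what the layer currently gets wrong for k
    return W + np.outer(residual, k) / (k @ k)

rng = np.random.default_rng(0)
hidden_dim, num_tokens = 8, 5              # toy sizes, hypothetical

# Toy projection from a decoder hidden state to docID-token logits.
W = rng.normal(size=(num_tokens, hidden_dim))

# Hidden state produced for a query about a newly added document.
k = rng.normal(size=hidden_dim)

# Desired logits: make docID token 3 dominant for this hidden state.
v_target = np.zeros(num_tokens)
v_target[3] = 10.0

W_edited = rank_one_edit(W, k, v_target)

# A second hidden state, projected to be orthogonal to k.
other = rng.normal(size=hidden_dim)
other -= (other @ k) / (k @ k) * k

print(np.argmax(W_edited @ k))                        # index of the top docID token for k
print(np.allclose(W_edited @ other, W @ other))       # orthogonal inputs are unaffected
```

In this sketch the edit changes the layer's output only along the direction of `k`: inputs orthogonal to `k` map to the same logits as before, while inputs close to `k` are shifted along with it.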
