IncDSI: Incrementally Updatable Document Retrieval
Varsha Kishore, Chao Wan, Justin Lovelace, Yoav Artzi, Kilian Q. Weinberger

TL;DR
IncDSI introduces a real-time, incremental update method for differentiable search index models, enabling fast addition of new documents without full retraining, thus facilitating dynamic and up-to-date document retrieval systems.
Contribution
The paper presents IncDSI, a novel approach that allows real-time document addition to differentiable search index models through minimal parameter adjustments, avoiding costly retraining.
Findings
IncDSI adds a new document in about 20-50 ms.
Retrieval performance remains competitive with retraining the model on the whole dataset.
It enables real-time updates in document retrieval systems.
Abstract
Differentiable Search Index is a recently proposed paradigm for document retrieval that encodes information about a corpus of documents within the parameters of a neural network and directly maps queries to corresponding documents. These models have achieved state-of-the-art performance for document retrieval across many benchmarks. However, they have a significant limitation: it is not easy to add new documents after a model is trained. We propose IncDSI, a method to add documents in real time (about 20-50 ms per document), without retraining the model on the entire dataset (or even parts thereof). Instead, we formulate the addition of documents as a constrained optimization problem that makes minimal changes to the network parameters. Although orders of magnitude faster, our approach is competitive with re-training the model on the whole dataset and enables the development of…
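To make the idea of adding a document through a small constrained update more concrete, here is a minimal sketch. It is not the paper's algorithm. It assumes, for illustration only, that the index is a matrix W of per-document vectors scored by dot product against a query embedding, that existing rows stay frozen, and that the new row is fitted with a simple hinge-loss subgradient loop. The names add_document, W, q_new, Q_old, y_old, margin, lr, and steps are all hypothetical.

```python
import numpy as np

def add_document(W, q_new, Q_old, y_old, margin=0.1, lr=0.1, steps=500):
    """Fit one new row for W while leaving existing rows untouched.

    W:     (n_docs, d) existing document vectors (assumed frozen)
    q_new: (d,) embedding of a query for the new document
    Q_old: (m, d) embeddings of queries for existing documents
    y_old: (m,) index of the correct document for each row of Q_old
    """
    # Score the new document must reach on its own query:
    # the best existing document's score plus a margin.
    target_new = (W @ q_new).max() + margin
    # Scores the new document must stay below on old queries:
    # each old query's correct-document score minus a margin.
    ceiling_old = (Q_old * W[y_old]).sum(axis=1) - margin

    w = q_new.copy()  # start from the query embedding (an assumption)
    for _ in range(steps):
        grad = np.zeros_like(w)
        # Constraint 1: the new query should retrieve the new document.
        if w @ q_new < target_new:
            grad -= q_new
        # Constraint 2: old queries should not prefer the new document.
        violated = Q_old @ w > ceiling_old
        grad += Q_old[violated].sum(axis=0)
        if not grad.any():
            break  # no violated constraint produced a gradient
        w -= lr * grad
    return np.vstack([W, w])
```

Because only one vector is optimized and the loop touches a handful of queries, an update of this shape is cheap compared with retraining, which is the property the abstract emphasizes. A fixed-step hinge loop like this one is not guaranteed to satisfy every constraint, so a real system would need a proper solver and a feasibility check.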
Taxonomy
Topics: Topic Modeling · Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques
