DREditor: An Time-efficient Approach for Building a Domain-specific Dense Retrieval Model
Chen Huang, Duanyu Feng, Wenqiang Lei, Jiancheng Lv

TL;DR
DREditor is a novel, time-efficient method for customizing dense retrieval models to specific domains by calibrating output embeddings through a linear mapping, significantly reducing calibration time while maintaining or improving performance.
Contribution
The paper introduces DREditor, a new embedding calibration approach that enables rapid domain adaptation of dense retrieval models without extensive fine-tuning.
Findings
DREditor is 100-300x more time-efficient than adapting the model through fine-tuning.
It maintains or surpasses the retrieval performance of traditional fine-tuning.
The approach is effective across various datasets, models, and hardware.
Abstract
Deploying dense retrieval models efficiently is becoming increasingly important across various industries. This is especially true for enterprise search services, where customizing search engines to meet the time demands of different enterprises in different domains is crucial. Motivated by this, we develop a time-efficient approach called DREditor to edit the matching rule of an off-the-shelf dense retrieval model to suit a specific domain. This is achieved by directly calibrating the output embeddings of the model using an efficient and effective linear mapping. This mapping is powered by an edit operator that is obtained by solving a specially constructed least squares problem. Compared to implicit rule modification via lengthy fine-tuning, our experimental results show that DREditor provides significant advantages on different domain-specific datasets, dataset sources, retrieval…
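To make the idea of calibrating output embeddings with a linear mapping concrete, here is a minimal sketch under stated assumptions. It is not the paper's edit operator: the "specially constructed" least squares problem is not given in this summary, so the sketch substitutes a plain ridge-regularized least squares fit that maps query embeddings onto the embeddings of their matching documents. The function names (`fit_linear_edit`, `top1_accuracy`), the `ridge` parameter, and the synthetic data with a random rotation standing in for domain shift are all hypothetical.

```python
import numpy as np


def fit_linear_edit(query_emb, doc_emb, ridge=1e-3):
    """Solve min_W ||query_emb @ W - doc_emb||^2 + ridge * ||W||^2 in closed form.

    Assumption: a generic ridge least squares fit, used here only as a
    stand-in for the paper's (unspecified) edit operator.
    """
    dim = query_emb.shape[1]
    gram = query_emb.T @ query_emb + ridge * np.eye(dim)
    return np.linalg.solve(gram, query_emb.T @ doc_emb)


def top1_accuracy(query_emb, doc_emb, gold):
    """Fraction of queries whose nearest document (cosine similarity) is the gold one."""
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    pred = np.argmax(q @ d.T, axis=1)
    return float(np.mean(pred == gold))


# Synthetic stand-in for a frozen retriever: query i should match document i,
# but the query embeddings are rotated away from the documents ("domain shift").
rng = np.random.default_rng(0)
n_docs, dim = 200, 16
docs = rng.normal(size=(n_docs, dim))
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
queries = docs @ rotation + 0.01 * rng.normal(size=(n_docs, dim))

# Fit the mapping on 150 labeled pairs, evaluate on the 50 held-out queries.
train, held_out = np.arange(150), np.arange(150, n_docs)
W = fit_linear_edit(queries[train], docs[train])

acc_before = top1_accuracy(queries[held_out], docs, held_out)
acc_after = top1_accuracy(queries[held_out] @ W, docs, held_out)
print(f"top-1 accuracy before: {acc_before:.2f}, after: {acc_after:.2f}")
```

The mapping is obtained from a single linear solve rather than gradient updates to the retriever, which is the general reason a closed-form calibration of this kind can be much cheaper than fine-tuning. The retriever's weights are never touched; only its output embeddings are post-processed.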
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
