Disentangled Modeling of Domain and Relevance for Adaptable Dense Retrieval
Jingtao Zhan, Qingyao Ai, Yiqun Liu, Jiaxin Mao, Xiaohui Xie, Min Zhang, Shaoping Ma

TL;DR
This paper introduces Disentangled Dense Retrieval (DDR), a novel framework that enhances domain adaptation in dense retrieval models by disentangling domain-invariant and domain-specific features, leading to improved cross-domain retrieval performance.
Contribution
The paper proposes DDR, which separates domain-invariant relevance modeling from domain-specific adaptation, enabling effective unsupervised training of domain modules and improved cross-domain retrieval.
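The separation described above can be pictured with a toy sketch. This is only an illustration under stated assumptions, not the authors' architecture: the class names `DomainModule` and `RelevanceModule`, the random weights, and the embedding size are all hypothetical. It shows the general idea that a shared relevance component can be paired with a swappable, per-domain component.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # toy embedding size (assumption, not from the paper)


class DomainModule:
    """Hypothetical domain-specific component, one instance per corpus.

    In the summary's terms this is the part that can be trained without
    relevance labels; here it is just a random nonlinear map.
    """

    def __init__(self, name, dim=DIM):
        self.name = name
        self.weight = rng.normal(size=(dim, dim))

    def __call__(self, features):
        return np.tanh(features @ self.weight)


class RelevanceModule:
    """Hypothetical domain-invariant component, shared across domains."""

    def __init__(self, dim=DIM):
        self.weight = rng.normal(size=(dim, dim))

    def __call__(self, hidden):
        return hidden @ self.weight


def encode(features, domain_module, relevance_module):
    # Compose the two parts: domain-specific first, shared relevance second.
    return relevance_module(domain_module(features))


relevance = RelevanceModule()            # reused unchanged in every domain
source_domain = DomainModule("source")
target_domain = DomainModule("target")   # the only part that changes

query = rng.normal(size=(1, DIM))
docs = rng.normal(size=(5, DIM))

# Moving to a new domain swaps the domain module and keeps the relevance module.
q_vec = encode(query, target_domain, relevance)
d_vecs = encode(docs, target_domain, relevance)
scores = (q_vec @ d_vecs.T).ravel()      # one relevance score per document
```

In this sketch, adapting to a new corpus means constructing another `DomainModule` while `relevance` stays fixed, which mirrors the claim that only the domain part needs domain-specific training.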
Findings
DDR outperforms strong dense retrieval baselines across multiple domains.
DDR significantly surpasses traditional retrieval methods in various scenarios.
Disentangling features enhances domain adaptation and retrieval effectiveness.
Abstract
Recent advances in Dense Retrieval (DR) techniques have significantly improved the effectiveness of first-stage retrieval. Trained with large-scale supervised data, DR models can encode queries and documents into a low-dimensional dense space and conduct effective semantic matching. However, previous studies have shown that the effectiveness of DR models drops by a large margin when the trained models are applied to a target domain that differs from the domain of the labeled data. One possible reason is that the DR model has never seen the target corpus and thus might be incapable of mitigating the difference between the training and target domains. In practice, unfortunately, training a DR model for each target domain to avoid domain shift is often difficult, as it requires additional time, storage, and domain-specific data labeling, which are not always…
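The semantic matching step mentioned in the abstract can be sketched as a nearest-neighbor search over vectors. The example below is a minimal illustration, assuming inner-product scoring and hand-written toy embeddings; the helper name `top_k` and all vector values are hypothetical, and a real system would obtain the vectors from a trained encoder.

```python
import numpy as np


def top_k(query_vec, doc_matrix, k):
    """Rank documents by inner product with the query and keep the best k."""
    scores = doc_matrix @ query_vec      # one score per document row
    order = np.argsort(-scores)[:k]      # indices sorted by descending score
    return order.tolist(), scores[order].tolist()


# Toy document embeddings (assumed values, one row per document).
doc_matrix = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
])

# Toy query embedding that points mostly along the first dimension.
query_vec = np.array([0.9, 0.1, 0.0])

ids, scores = top_k(query_vec, doc_matrix, k=2)
```

Here document 0 scores highest and document 2 second, because their vectors align best with the query. Exhaustive scoring like this is fine for a toy corpus; large collections typically rely on an approximate nearest-neighbor index instead.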
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Multimodal Machine Learning Applications
Methods: Dense Connections · Q-Learning · Convolution · Deep Q-Network · Random Ensemble Mixture
