Disentangled Modeling of Domain and Relevance for Adaptable Dense Retrieval

Jingtao Zhan; Qingyao Ai; Yiqun Liu; Jiaxin Mao; Xiaohui Xie; Min Zhang; Shaoping Ma

arXiv:2208.05753 · cs.IR · August 18, 2022 · 6 cites

TL;DR

This paper introduces Disentangled Dense Retrieval (DDR), a novel framework that enhances domain adaptation in dense retrieval models by disentangling domain-invariant and domain-specific features, leading to improved cross-domain retrieval performance.

Contribution

The paper proposes DDR, which separates domain-invariant relevance modeling from domain-specific adaptation, enabling effective unsupervised training of domain modules and improved cross-domain retrieval.
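The separation described above can be illustrated with a toy sketch. This is not the authors' implementation: the class names `RelevanceModule`, `DomainModule`, and `DisentangledEncoder` are hypothetical, and small linear maps stand in for the real neural components. It only shows the structural idea that one shared relevance component stays fixed while a per-domain component is swapped in.

```python
from dataclasses import dataclass
from typing import Dict, List

Vector = List[float]


def matvec(matrix: List[Vector], vec: Vector) -> Vector:
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def dot(a: Vector, b: Vector) -> float:
    """Inner product, used here as the relevance score."""
    return sum(x * y for x, y in zip(a, b))


@dataclass(frozen=True)
class RelevanceModule:
    """Hypothetical shared, domain-invariant part (frozen after training)."""
    weights: List[Vector]

    def project(self, features: Vector) -> Vector:
        return matvec(self.weights, features)


@dataclass
class DomainModule:
    """Hypothetical domain-specific part, one instance per corpus."""
    weights: List[Vector]

    def adapt(self, features: Vector) -> Vector:
        return matvec(self.weights, features)


class DisentangledEncoder:
    """Composes a swappable domain module with the shared relevance module."""

    def __init__(self, relevance: RelevanceModule, domains: Dict[str, DomainModule]):
        self.relevance = relevance
        self.domains = domains

    def encode(self, features: Vector, domain: str) -> Vector:
        # Domain-specific adaptation first, then the shared relevance projection.
        adapted = self.domains[domain].adapt(features)
        return self.relevance.project(adapted)

    def score(self, query: Vector, doc: Vector, domain: str) -> float:
        return dot(self.encode(query, domain), self.encode(doc, domain))


# Toy data: identity relevance module, two illustrative domain modules.
relevance = RelevanceModule(weights=[[1.0, 0.0], [0.0, 1.0]])
domains = {
    "source": DomainModule(weights=[[1.0, 0.0], [0.0, 1.0]]),
    "target": DomainModule(weights=[[2.0, 0.0], [0.0, 1.0]]),
}
encoder = DisentangledEncoder(relevance, domains)

query, doc = [1.0, 2.0], [3.0, 1.0]
source_score = encoder.score(query, doc, "source")
target_score = encoder.score(query, doc, "target")
```

Switching the `domain` argument changes the scores without touching `relevance`, which is the property the contribution statement attributes to the disentangled design.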

Findings

01. DDR outperforms strong dense retrieval baselines across multiple domains.

02. DDR significantly surpasses traditional retrieval methods in various scenarios.

03. Disentangling features enhances domain adaptation and retrieval effectiveness.

Abstract

Recent advances in Dense Retrieval (DR) techniques have significantly improved the effectiveness of first-stage retrieval. Trained with large-scale supervised data, DR models can encode queries and documents into a low-dimensional dense space and conduct effective semantic matching. However, previous studies have shown that the effectiveness of DR models drops by a large margin when the trained models are applied to a target domain that differs from the domain of the labeled data. One possible reason is that the DR model has never seen the target corpus and thus might be incapable of mitigating the difference between the training and target domains. In practice, unfortunately, training a DR model for each target domain to avoid domain shift is often difficult, as it requires additional time, storage, and domain-specific data labeling, which are not always…

Code & Models

Repositories

jingtaozhan/disentangled-retriever
PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Multimodal Machine Learning Applications

Methods: Dense Connections · Q-Learning · Convolution · Deep Q-Network · Random Ensemble Mixture