More Robust Dense Retrieval with Contrastive Dual Learning
Yizhi Li, Zhenghao Liu, Chenyan Xiong, Zhiyuan Liu

TL;DR
This paper introduces DANCE, a contrastive dual learning approach that enhances dense retrieval by learning fine-grained query representations, resulting in a more uniform embedding space and improved retrieval performance.
Contribution
DANCE is a training paradigm that adds a dual query-retrieval objective to the standard contrastive training of dense retrievers, so that query embeddings are positioned at a finer grain instead of only being aligned to their matching documents.
Findings
DANCE achieves superior ranking performance on MS MARCO.
It produces a more uniform and discriminative embedding space.
Enhanced query representations lead to better retrieval accuracy.
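To make "more uniform" concrete, the sketch below computes one common uniformity measure for embeddings on the unit hypersphere: the log of the mean Gaussian potential over all pairs, following Wang and Isola (2020). The summary does not say which measure the paper uses, so the function name `uniformity`, the temperature `t=2.0`, and the toy vectors are all illustrative assumptions. Lower values mean the embeddings are spread more evenly.

```python
import math
from itertools import combinations


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def uniformity(embeddings, t=2.0):
    """Log of the mean Gaussian potential over all pairs of unit vectors.

    Assumed metric (not confirmed by the source); lower means more uniform.
    """
    unit = [normalize(e) for e in embeddings]
    potentials = []
    for a, b in combinations(unit, 2):
        sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
        potentials.append(math.exp(-t * sq_dist))
    return math.log(sum(potentials) / len(potentials))


# Toy data: two collapsed embeddings versus two antipodal embeddings.
collapsed = [[1.0, 0.0], [1.0, 0.0]]
spread = [[1.0, 0.0], [-1.0, 0.0]]
print(uniformity(collapsed), uniformity(spread))
```

On this toy data the collapsed pair scores 0 and the antipodal pair scores about -8, so the spread-out set counts as more uniform.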
Abstract
Dense retrieval conducts text retrieval in the embedding space and has shown many advantages compared to sparse retrieval. Existing dense retrievers optimize representations of queries and documents with contrastive training and map them to the embedding space. The embedding space is optimized by aligning the matched query-document pairs and pushing the negative documents away from the query. However, in such a training paradigm, the queries are only optimized to align to the documents and are coarsely positioned, leading to an anisotropic query embedding space. In this paper, we analyze the embedding space distributions and propose an effective training paradigm, Contrastive Dual Learning for Approximate Nearest Neighbor (DANCE), to learn fine-grained query representations for dense retrieval. DANCE incorporates an additional dual training objective of query retrieval, inspired by the…
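The two training directions described in the abstract can be sketched as a pair of softmax losses over one query-document score matrix. This is a minimal illustration, not the paper's implementation. It assumes in-batch negatives, dot-product similarity, and a hypothetical `dual_weight` parameter for mixing the two terms, none of which are confirmed by the truncated abstract. It is written in plain Python to stay dependency-free.

```python
import math


def dot(a, b):
    """Dot-product similarity between two vectors."""
    return sum(x * y for x, y in zip(a, b))


def log_softmax_at(scores, target):
    """Log-probability of scores[target] under a softmax over scores."""
    peak = max(scores)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return scores[target] - log_norm


def dual_contrastive_loss(queries, docs, dual_weight=1.0):
    """Illustrative loss; queries[i] is assumed to match docs[i].

    Returns (total, document_retrieval_loss, query_retrieval_loss).
    """
    n = len(queries)
    # scores[i][j] is the similarity between query i and document j.
    scores = [[dot(q, d) for d in docs] for q in queries]
    # Document retrieval: each query must pick its document among all documents.
    doc_loss = -sum(log_softmax_at(scores[i], i) for i in range(n)) / n
    # Query retrieval (dual): each document must pick its query among all queries.
    query_loss = -sum(
        log_softmax_at([scores[i][j] for i in range(n)], j) for j in range(n)
    ) / n
    return doc_loss + dual_weight * query_loss, doc_loss, query_loss


# Toy batch: two queries, each closest to the document at the same index.
queries = [[1.0, 0.0], [0.0, 1.0]]
docs = [[0.9, 0.1], [0.2, 0.8]]
total, doc_loss, query_loss = dual_contrastive_loss(queries, docs)
print(total, doc_loss, query_loss)
```

The first term only asks each query to rank its own document above the others. The second term also asks each document to tell its own query apart from the other queries, which is one way to read the abstract's point about positioning queries more finely.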
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques · Text and Document Classification Technologies
Methods: Contrastive Dual Learning for Approximate Nearest Neighbor (DANCE)
