A Thorough Examination on Zero-shot Dense Retrieval
Ruiyang Ren, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Qifei Wu, Yuchen Ding, Hua Wu, Haifeng Wang, Ji-Rong Wen

TL;DR
This paper analyzes the zero-shot capability of dense retrieval (DR) models: the factors that influence it, the biases that affect it, and how DR models compare with traditional sparse retrieval methods such as BM25.
Contribution
It offers the first detailed study on zero-shot dense retrieval, analyzing key factors affecting performance and comparing existing models to guide future research.
Findings
Several factors related to the source training set significantly influence zero-shot performance
Bias from the target dataset affects retrieval effectiveness
Existing zero-shot DR models vary in effectiveness
Abstract
Recent years have witnessed significant advances in dense retrieval (DR) based on powerful pre-trained language models (PLMs). DR models have achieved excellent performance on several benchmark datasets, yet they have been shown to be less competitive than traditional sparse retrieval models (e.g., BM25) in a zero-shot retrieval setting. However, the related literature still lacks a detailed and comprehensive study on zero-shot retrieval. In this paper, we present the first thorough examination of the zero-shot capability of DR models. We aim to identify the key factors and analyze how they affect zero-shot retrieval performance. In particular, we discuss the effect of several key factors related to the source training set, analyze the potential bias from the target dataset, and review and compare existing zero-shot DR models. Our findings provide important evidence to better…
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
