Towards Consistency Filtering-Free Unsupervised Learning for Dense Retrieval
Haoxiang Shi, Sumio Fujita, Tetsuya Sakai

TL;DR
This paper proposes a more efficient, filtering-free unsupervised learning approach for dense retrieval that maintains or improves performance by replacing costly consistency filters with pseudo-labeling and keyword methods.
Contribution
It introduces a novel filtering-free unsupervised training paradigm for dense retrieval, eliminating the need for expensive consistency filtering and enhancing efficiency.
Findings
Pseudo-relevance feedback with TextRank outperforms the other filtering-free methods evaluated.
The filtering-free approach improves training and inference efficiency.
On some datasets, the filtering-free approach even improves retrieval performance.
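To make the TextRank finding concrete, the following is a minimal sketch of TextRank-style keyword extraction, assuming a word co-occurrence graph scored with a PageRank-style update. The function name `textrank_keywords`, the stopword list, the window size, and the damping factor are illustrative assumptions; the summary does not say how the paper configures TextRank or how the extracted keywords feed into pseudo-relevance feedback.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for",
             "on", "with", "by", "are", "that", "this"}


def textrank_keywords(text, top_k=5, window=2, damping=0.85, iterations=50):
    """Rank words by a PageRank-style score over a co-occurrence graph."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower())
              if t not in STOPWORDS]

    # Undirected graph: two distinct words are linked if they occur
    # within `window` positions of each other.
    neighbors = defaultdict(set)
    for i, tok in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            if tokens[j] != tok:
                neighbors[tok].add(tokens[j])
                neighbors[tokens[j]].add(tok)

    if not neighbors:
        return []

    # Iterative update: each word receives an equal share of the
    # score of every neighbor.
    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        scores = {
            w: (1 - damping) + damping * sum(
                scores[n] / len(neighbors[n]) for n in neighbors[w]
            )
            for w in neighbors
        }

    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


if __name__ == "__main__":
    passage = ("Dense retrieval models encode queries and passages into "
               "vectors, and retrieval quality depends on the training data.")
    print(textrank_keywords(passage, top_k=3))
```

In a pseudo-relevance feedback setting, keywords like these could be extracted from top-ranked passages and used as pseudo-queries or query expansions; that wiring is left out here because the summary does not specify it.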
Abstract
Domain transfer is a prevalent challenge in modern neural Information Retrieval (IR). To overcome this problem, previous research has utilized domain-specific manual annotations and synthetic data produced by consistency filtering to finetune a general ranker and produce a domain-specific ranker. However, training such consistency filters is computationally expensive, which significantly reduces model efficiency. In addition, consistency filtering often struggles to identify retrieval intentions and recognize query and corpus distributions in a target domain. In this study, we evaluate a more efficient solution: replacing the consistency filter with either direct pseudo-labeling, pseudo-relevance feedback, or unsupervised keyword generation methods to achieve consistency filtering-free unsupervised dense retrieval. Our extensive experimental evaluations demonstrate that, on…
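As an illustration of the direct pseudo-labeling option named in the abstract, the sketch below shows one common reading: a general-purpose dense retriever scores every target-domain passage for each query, and the top-ranked passages are taken as pseudo-positive training pairs without any filtering step. The function `pseudo_label`, the dot-product scoring, and the toy embeddings are assumptions for illustration; the summary does not describe the paper's actual procedure.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def pseudo_label(query_vecs, doc_vecs, top_k=1):
    """Return (query_id, doc_id) pairs, keeping each query's top_k passages.

    query_vecs and doc_vecs map ids to embedding vectors that are assumed
    to come from the same general-purpose retriever.
    """
    pairs = []
    for qid, q in query_vecs.items():
        # Highest similarity first; ties broken by doc id for determinism.
        ranked = sorted(doc_vecs, key=lambda did: (-dot(q, doc_vecs[did]), did))
        for did in ranked[:top_k]:
            pairs.append((qid, did))
    return pairs


if __name__ == "__main__":
    # Toy 2-dimensional embeddings (hypothetical values).
    queries = {"q1": [1.0, 0.0], "q2": [0.0, 1.0]}
    docs = {"d1": [0.9, 0.1], "d2": [0.2, 0.8]}
    print(pseudo_label(queries, docs))
```

The resulting pairs would then serve as positives when finetuning the retriever on the target domain. The trade-off is that no filter removes noisy pairs, which is the cost the filtering-free setting accepts in exchange for efficiency.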
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Advanced Graph Neural Networks
