Towards Consistency Filtering-Free Unsupervised Learning for Dense Retrieval

Haoxiang Shi; Sumio Fujita; Tetsuya Sakai

arXiv:2308.02926 · cs.IR · August 8, 2023

TL;DR

This paper proposes a more efficient, filtering-free unsupervised learning approach for dense retrieval that maintains or improves performance by replacing costly consistency filters with pseudo-labeling and keyword methods.

Contribution

It introduces a novel filtering-free unsupervised training paradigm for dense retrieval, eliminating the need for expensive consistency filtering and enhancing efficiency.

Findings

1. Pseudo-relevance feedback with TextRank outperforms the other methods evaluated.

2. The filtering-free approach improves training and inference efficiency.

3. On some datasets, the filtering-free approach also improves retrieval performance.
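
The first finding refers to TextRank-based pseudo-relevance feedback. As a rough illustration of the keyword-extraction step only, the sketch below ranks the words of a passage with a TextRank-style PageRank over a word co-occurrence graph. It is not the authors' implementation: the function name `textrank_keywords`, the tiny stopword list, the window size, and the damping factor are all assumptions chosen for the example.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; real systems use larger lists).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "are",
             "for", "on", "with", "by", "that", "this"}


def textrank_keywords(text, top_k=5, window=3, damping=0.85, iterations=50):
    """Rank words in `text` by PageRank over a word co-occurrence graph."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower())
              if t not in STOPWORDS]

    # Undirected graph: link each word to the words that follow it
    # inside a sliding window of `window` tokens.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    graph = dict(neighbors)
    if not graph:
        return []

    # Power iteration with the unnormalized TextRank update rule.
    scores = {w: 1.0 for w in graph}
    for _ in range(iterations):
        scores = {
            w: (1 - damping)
            + damping * sum(scores[n] / len(graph[n]) for n in graph[w])
            for w in graph
        }

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


if __name__ == "__main__":
    passage = ("Dense retrieval models map queries and documents to vectors; "
               "dense retrieval needs training data")
    print(textrank_keywords(passage, top_k=3))
```

In a pseudo-relevance feedback setting, keywords extracted this way from top-ranked passages could serve as pseudo-queries for training, but how the paper combines these steps is not specified in this summary.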

Abstract

Domain transfer is a prevalent challenge in modern neural Information Retrieval (IR). To overcome this problem, previous research has utilized domain-specific manual annotations and synthetic data produced by consistency filtering to finetune a general ranker and produce a domain-specific ranker. However, training such consistency filters is computationally expensive, which significantly reduces model efficiency. In addition, consistency filtering often struggles to identify retrieval intentions and recognize query and corpus distributions in a target domain. In this study, we evaluate a more efficient solution: replacing the consistency filter with direct pseudo-labeling, pseudo-relevance feedback, or unsupervised keyword generation methods to achieve consistency filtering-free unsupervised dense retrieval. Our extensive experimental evaluations demonstrate that, on…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Advanced Graph Neural Networks