In Defense of Cross-Encoders for Zero-Shot Retrieval

Guilherme Rosa, Luiz Bonifacio, Vitor Jeronymo, Hugo Abonizio, Marzieh Fadaee, Roberto Lotufo, Rodrigo Nogueira

arXiv:2212.06121 · cs.IR · December 13, 2022 · 6 cites

TL;DR

This paper shows that cross-encoders generalize better than bi-encoders of similar size in zero-shot retrieval, especially in out-of-domain scenarios, and attributes this to parameter count and early query-document interaction.

Contribution

It provides a comprehensive analysis of how model size and architecture influence zero-shot retrieval performance, highlighting the advantages of cross-encoders over bi-encoders.

Findings

01

Cross-encoders outperform bi-encoders of similar size in multiple tasks.

02

Increasing model size yields only marginal gains on in-domain test sets but much larger gains in out-of-domain retrieval.

03

Using bi-encoders as first-stage retrievers offers no advantage over BM25 in out-of-domain settings.
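
The third finding concerns two-stage pipelines, in which a cheap first-stage retriever selects candidates and a cross-encoder reranks them. A minimal sketch of that setup with BM25 as the first stage might look like the following. The whitespace tokenizer, the BM25 parameters (k1 = 1.2, b = 0.75), and the `overlap_reranker` function are assumptions made for illustration; in the paper the reranker is a fine-tuned neural cross-encoder, not the toy scorer used here.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple


def tokenize(text: str) -> List[str]:
    # Assumption: lowercase whitespace tokenization; real systems use a proper analyzer.
    return text.lower().split()


class BM25:
    """Minimal Okapi BM25 over an in-memory corpus (illustrative only)."""

    def __init__(self, corpus: List[str], k1: float = 1.2, b: float = 0.75):
        self.k1, self.b = k1, b
        self.docs = [tokenize(d) for d in corpus]
        self.tfs = [Counter(d) for d in self.docs]
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        n = len(self.docs)
        df = Counter(t for d in self.docs for t in set(d))
        # Lucene-style IDF, which is always non-negative.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query: str, i: int) -> float:
        tf, dl = self.tfs[i], len(self.docs[i])
        norm = self.k1 * (1 - self.b + self.b * dl / self.avgdl)
        return sum(
            self.idf[t] * tf[t] * (self.k1 + 1) / (tf[t] + norm)
            for t in tokenize(query) if t in tf
        )

    def top_k(self, query: str, k: int) -> List[int]:
        # First stage: rank every document lexically and keep the best k.
        ranked = sorted(range(len(self.docs)),
                        key=lambda i: self.score(query, i), reverse=True)
        return ranked[:k]


def rerank(query: str, corpus: List[str], candidates: List[int],
           pair_scorer: Callable[[str, str], float]) -> List[Tuple[int, float]]:
    # Second stage: score each (query, document) pair jointly, then sort.
    scored = [(i, pair_scorer(query, corpus[i])) for i in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def overlap_reranker(query: str, doc: str) -> float:
    # Hypothetical stand-in for a cross-encoder: Jaccard overlap of the two token sets.
    q, d = set(tokenize(query)), set(tokenize(doc))
    return len(q & d) / len(q | d) if (q | d) else 0.0


corpus = [
    "bm25 is a lexical ranking function",
    "cross-encoders score a query and document jointly",
    "bi-encoders embed queries and documents separately",
]
query = "how do cross-encoders score a query"
candidates = BM25(corpus).top_k(query, k=2)
results = rerank(query, corpus, candidates, overlap_reranker)
```

Because `rerank` takes the pair scorer as an argument, the first stage can be swapped (for example, for a bi-encoder) without touching the reranking step, which is the comparison the finding describes.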

Abstract

Bi-encoders and cross-encoders are widely used in many state-of-the-art retrieval pipelines. In this work we study the generalization ability of these two types of architectures across a wide range of parameter counts in both in-domain and out-of-domain scenarios. We find that the number of parameters and early query-document interactions of cross-encoders play a significant role in the generalization ability of retrieval models. Our experiments show that increasing model size results in marginal gains on in-domain test sets, but much larger gains in new domains never seen during fine-tuning. Furthermore, we show that cross-encoders largely outperform bi-encoders of similar size in several tasks. In the BEIR benchmark, our largest cross-encoder surpasses a state-of-the-art bi-encoder by more than 4 average points. Finally, we show that using bi-encoders as first-stage retrievers provides no…
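
To make the architectural contrast in the abstract concrete, the sketch below shows only the data flow of the two designs, not their quality. A bi-encoder encodes the query and each document independently, so the two meet only in a final dot product and document vectors can be computed ahead of time. A cross-encoder receives both texts in one sequence, so they can interact from the first layer on. Everything here is a toy assumption: `embed` is a hashed bag-of-words stand-in for a neural encoder, `cross_encoder_score` is a hypothetical token-matching stand-in for a transformer, and `DIM` is arbitrary.

```python
import math
import zlib
from typing import List

DIM = 64  # assumption: toy embedding size


def embed(text: str) -> List[float]:
    """Toy encoder: hashed bag-of-words, L2-normalized. Stands in for a neural encoder."""
    vec = [0.0] * DIM
    for tok in text.lower().split():
        vec[zlib.crc32(tok.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def bi_encoder_scores(query: str, doc_vectors: List[List[float]]) -> List[float]:
    # Query and documents are encoded independently; they only meet in the dot product.
    q = embed(query)
    return [sum(a * b for a, b in zip(q, d)) for d in doc_vectors]


def joint_input(query: str, doc: str) -> List[str]:
    # Cross-encoder input: a single sequence containing both texts.
    return ["[CLS]", *query.lower().split(), "[SEP]", *doc.lower().split(), "[SEP]"]


def cross_encoder_score(query: str, doc: str) -> float:
    """Toy pair scorer over the joint sequence (hypothetical stand-in for a transformer)."""
    seq = joint_input(query, doc)
    sep = seq.index("[SEP]")
    q_toks, d_toks = seq[1:sep], seq[sep + 1:-1]
    # Early interaction: every query token is compared against every document token.
    hits = sum(1 for q in q_toks if any(q == d for d in d_toks))
    return hits / max(len(q_toks), 1)


docs = ["neural ranking models", "lexical matching with bm25"]
doc_vectors = [embed(d) for d in docs]            # computed once, offline
bi_scores = bi_encoder_scores("neural ranking", doc_vectors)
cross_scores = [cross_encoder_score("neural ranking", d) for d in docs]  # one pass per pair
```

The cost difference follows from the data flow: the bi-encoder needs one query encoding per search, while the cross-encoder needs one scoring pass per query-document pair, which is why it is typically applied to a short candidate list.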

Code & Models

Repositories

guilhermemr04/scaling-zero-shot-retrieval (Official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Topic Modeling

Methods: Test