Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization
Nishant Yadav, Nicholas Monath, Rico Angell, Manzil Zaheer, Andrew, McCallum

TL;DR
This paper introduces a matrix factorization method using CUR decomposition to efficiently perform cross-encoder based nearest neighbor search, surpassing existing dual-encoder reranking methods in recall and computational cost.
Contribution
The authors propose a novel retrieval approach that directly uses cross-encoders with matrix factorization, eliminating the need for auxiliary models and improving efficiency and accuracy.
Findings
Outperforms dual-encoder reranking in recall for k > 10
Reduces computational cost compared to training dual-encoder models
Provides better recall-cost trade-offs in empirical tests
Abstract
Efficient k-nearest neighbor search is a fundamental task, foundational for many problems in NLP. When the similarity is measured by dot-product between dual-encoder vectors or -distance, there already exist many scalable and efficient search methods. But not so when similarity is measured by more accurate and expensive black-box neural similarity models, such as cross-encoders, which jointly encode the query and candidate neighbor. The cross-encoders' high computational cost typically limits their use to reranking candidates retrieved by a cheaper model, such as dual encoder or TF-IDF. However, the accuracy of such a two-stage approach is upper-bounded by the recall of the initial candidate set, and potentially requires additional training to align the auxiliary retrieval model with the cross-encoder model. In this paper, we present an approach that avoids the use of a…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
MethodsALIGN
