Early Exit Strategies for Approximate k-NN Search in Dense Retrieval

Francesco Busolin; Claudio Lucchese; Franco Maria Nardini; Salvatore Orlando; Raffaele Perego; Salvatore Trani

arXiv:2408.04981·cs.IR·August 12, 2024

TL;DR

This paper introduces an unsupervised early exit strategy for approximate k-NN (A-kNN) search in dense retrieval. By deciding adaptively, per query, when to stop visiting clusters, it reports up to a 5x speedup with negligible loss in retrieval effectiveness.

Contribution

It proposes a novel unsupervised patience-based early exit method and a cascade approach for efficient dense retrieval, outperforming existing strategies in speed while maintaining effectiveness.

Findings

1. Up to 5x speedup in A-kNN search efficiency.
2. Negligible loss in retrieval effectiveness.
3. Reproducible results with publicly available code.

Abstract

Learned dense representations are a popular family of techniques for encoding queries and documents as high-dimensional embeddings, which enable retrieval by approximate k-nearest-neighbor (A-kNN) search. A popular technique for making A-kNN search efficient is based on a two-level index, where the document embeddings are clustered offline and, at query processing time, a fixed number N of clusters closest to the query is visited exhaustively to compute the result set. In this paper, we build upon the state of the art in early-exit A-kNN and propose an unsupervised method based on the notion of patience, which can reach competitive effectiveness with large efficiency gains. Moreover, we discuss a cascade approach where we first identify queries that find their nearest neighbor within the closest t << N clusters, and then we decide how many more to visit based on our patience…
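To make the two-level index and the idea of patience concrete, here is a minimal Python sketch on toy data. The abstract does not spell out the exact stopping criterion, so the rule used here is an assumption: stop once the current top-k set has stayed unchanged for `patience` consecutive clusters. The names `patience_search`, `sq_dist`, `max_clusters` (playing the role of N), and the toy vectors are all hypothetical and are not taken from the authors' code.

```python
import heapq


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def patience_search(query, centroids, clusters, k, max_clusters, patience):
    """Cluster-based A-kNN with an ASSUMED patience-style early exit.

    Clusters are visited from nearest to farthest centroid, up to
    `max_clusters` (the fixed N of the baseline). The search stops early
    when the set of top-k document ids has not changed for `patience`
    consecutive clusters. Returns (top-k doc ids, clusters visited).
    """
    order = sorted(range(len(centroids)),
                   key=lambda c: sq_dist(query, centroids[c]))
    top = []      # current best (distance, doc_id) pairs, at most k
    stable = 0    # consecutive clusters that left the top-k unchanged
    visited = 0
    for c in order[:max_clusters]:
        visited += 1
        scored = [(sq_dist(query, vec), doc_id) for doc_id, vec in clusters[c]]
        new_top = heapq.nsmallest(k, top + scored)
        if top and {d for _, d in new_top} == {d for _, d in top}:
            stable += 1
        else:
            stable = 0
        top = new_top
        if stable >= patience:
            break  # early exit: results look settled
    return [doc_id for _, doc_id in top], visited


# Toy two-level index: four clusters of (doc_id, vector) pairs.
centroids = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0), (30.0, 0.0)]
clusters = [
    [(0, (0.0, 0.1)), (1, (0.2, 0.0)), (2, (1.0, 1.0))],
    [(3, (10.0, 0.0)), (4, (9.5, 0.5))],
    [(5, (20.0, 0.0))],
    [(6, (30.0, 0.0))],
]
query = (0.0, 0.0)

early_ids, early_visited = patience_search(query, centroids, clusters,
                                           k=2, max_clusters=4, patience=1)
full_ids, full_visited = patience_search(query, centroids, clusters,
                                         k=2, max_clusters=4, patience=10)
```

On this toy index, the low-patience run returns the same two documents as the exhaustive run while visiting fewer clusters. A larger `patience` trades speed for a lower risk of stopping before a closer document in a later cluster is seen.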
