TL;DR
This paper introduces the Fast-Forward index, a simple vector forward index that interpolates lexical and semantic scores for fast neural document ranking on CPUs, improving both query processing efficiency and ranking performance.
Contribution
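The paper presents a vector forward index that enables fast CPU-based neural ranking. Candidates are retrieved with an efficient sparse model and then re-scored by interpolating their lexical scores with semantic scores computed from pre-computed dense document representations, which are looked up in constant time. Index pruning and early stopping further improve query processing throughput.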
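The interpolation step can be illustrated with a minimal sketch. It assumes the forward index is a plain in-memory dictionary from document ID to a pre-computed vector, that semantic similarity is a dot product, and that the two scores are mixed with a single weight `alpha`. The function names, the `alpha` parameter, and the toy data are hypothetical and are not taken from the paper's code.

```python
def dot(a, b):
    """Dot product of two equal-length vectors given as lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def interpolate_rerank(query_vec, sparse_hits, forward_index, alpha=0.5):
    """Re-rank sparse retrieval hits by interpolating lexical and semantic scores.

    sparse_hits:   list of (doc_id, lexical_score) from a sparse retriever
    forward_index: dict mapping doc_id -> pre-computed document vector
                   (assumed layout, for illustration only)
    alpha:         weight of the lexical score (hypothetical parameter name)
    """
    ranked = []
    for doc_id, lexical in sparse_hits:
        doc_vec = forward_index[doc_id]        # constant-time look-up, no nearest neighbor search
        semantic = dot(query_vec, doc_vec)     # cheap enough to run on a CPU
        ranked.append((doc_id, alpha * lexical + (1.0 - alpha) * semantic))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Toy usage with made-up vectors and scores.
toy_index = {"d1": [1.0, 0.0], "d2": [0.0, 1.0]}
toy_hits = [("d1", 0.9), ("d2", 0.5)]
toy_ranking = interpolate_rerank([0.0, 1.0], toy_hits, toy_index, alpha=0.5)
```

In this toy run the semantically closer document `d2` overtakes `d1` even though its lexical score is lower, which is the kind of complementary effect interpolation is meant to capture.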
Findings
Significant speedup in query processing on TREC-DL datasets.
Improved ranking performance over hybrid indexes.
Efficient retrieval using only CPU resources.
Abstract
Neural document ranking approaches, specifically transformer models, have achieved impressive gains in ranking performance. However, query processing using such over-parameterized models is both resource and time intensive. In this paper, we propose the Fast-Forward index -- a simple vector forward index that facilitates ranking documents using interpolation of lexical and semantic scores -- as a replacement for contextual re-rankers and dense indexes based on nearest neighbor search. Fast-Forward indexes rely on efficient sparse models for retrieval and merely look up pre-computed dense transformer-based vector representations of documents and passages in constant time for fast CPU-based semantic similarity computation during query processing. We propose index pruning and theoretically grounded early stopping techniques to improve the query processing throughput. We conduct extensive…
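The abstract does not spell out the early stopping rule, so the following is only a sketch of one way such a rule could work, not the authors' algorithm. It assumes candidates arrive sorted by descending lexical score and that the semantic score has a known upper bound `max_semantic` (for example 1.0 for unit-length vectors compared by dot product). Once the best score any remaining candidate could reach cannot beat the current k-th best, no further look-ups are needed. All names here are hypothetical.

```python
import heapq


def dot(a, b):
    """Dot product of two equal-length vectors given as lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def rerank_early_stop(query_vec, sparse_hits, forward_index,
                      k=10, alpha=0.5, max_semantic=1.0):
    """Top-k interpolation re-ranking that stops once no remaining hit can enter the top k.

    sparse_hits must be sorted by lexical score in descending order.
    max_semantic is an assumed upper bound on the semantic score.
    Returns (top_k, lookups), where top_k is a list of (doc_id, score).
    """
    top = []        # min-heap of (score, doc_id); top[0] is the current k-th best
    lookups = 0
    for doc_id, lexical in sparse_hits:
        if len(top) == k:
            # Upper bound on the score of this and every later candidate.
            best_possible = alpha * lexical + (1.0 - alpha) * max_semantic
            if best_possible <= top[0][0]:
                break
        semantic = dot(query_vec, forward_index[doc_id])
        lookups += 1
        score = alpha * lexical + (1.0 - alpha) * semantic
        if len(top) < k:
            heapq.heappush(top, (score, doc_id))
        elif score > top[0][0]:
            heapq.heapreplace(top, (score, doc_id))
    top_k = [(doc_id, score) for score, doc_id in sorted(top, reverse=True)]
    return top_k, lookups


# Toy usage with made-up unit vectors and lexical scores.
toy_index = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 0.0]}
toy_hits = [("a", 10.0), ("b", 9.0), ("c", 2.0)]
toy_top, toy_lookups = rerank_early_stop([1.0, 0.0], toy_hits, toy_index, k=2)
```

In the toy run the third candidate is never looked up: its best possible score (1.5) is below the current second-best score (4.5), so the loop ends after two look-ups.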
Taxonomy
Methods: Pruning · Early Stopping
