A Static Pruning Study on Sparse Neural Retrievers
Carlos Lassance, Simon Lupart, Hervé Dejean, Stéphane Clinchant, Nicola Tonellotto

TL;DR
This paper investigates the application of static pruning techniques to sparse neural retrievers, demonstrating significant speedups with minimal impact on retrieval effectiveness across various datasets.
Contribution
It is the first comprehensive study showing that static pruning strategies can effectively accelerate sparse neural retrievers with negligible accuracy loss.
Findings
Static pruning achieves up to 4x speedup.
Negligible effectiveness loss (≤2%) with certain pruning strategies.
Neural rerankers remain robust to pruned candidate sets.
Abstract
Sparse neural retrievers, such as DeepImpact, uniCOIL and SPLADE, have been introduced recently as an efficient and effective way to perform retrieval with inverted indexes. They aim to learn term importance and, in some cases, document expansions, to provide a more effective document ranking compared to traditional bag-of-words retrieval models such as BM25. However, these sparse neural retrievers have been shown to increase the computational costs and latency of query processing compared to their classical counterparts. To mitigate this, we apply a well-known family of techniques for boosting the efficiency of query processing over inverted indexes: static pruning. We experiment with three static pruning strategies, namely document-centric, term-centric and agnostic pruning, and we assess, over diverse datasets, that these techniques still work with sparse neural retrievers. In…
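The three strategy names in the abstract can be illustrated on a toy impact-scored inverted index. What follows is a minimal sketch under common textbook readings of those names, not the authors' implementation: term-centric pruning keeps the top-k postings of each term, document-centric pruning keeps the top-k terms of each document, and agnostic pruning drops every posting below one global score threshold. The index contents, function names, and the parameters `k` and `threshold` are all hypothetical.

```python
from collections import defaultdict

# Toy index: term -> {doc_id: impact score}. All values are made up for illustration.
index = {
    "neural": {"d1": 3.0, "d2": 0.5, "d3": 1.2},
    "sparse": {"d1": 0.4, "d2": 2.5},
    "index": {"d2": 0.3, "d3": 2.0},
}


def term_centric(index, k):
    """Keep the k highest-scoring postings in each term's posting list."""
    return {
        term: dict(sorted(postings.items(), key=lambda p: p[1], reverse=True)[:k])
        for term, postings in index.items()
    }


def document_centric(index, k):
    """Keep, for each document, only its k highest-scoring terms."""
    by_doc = defaultdict(list)
    for term, postings in index.items():
        for doc, score in postings.items():
            by_doc[doc].append((term, score))
    pruned = defaultdict(dict)
    for doc, terms in by_doc.items():
        for term, score in sorted(terms, key=lambda t: t[1], reverse=True)[:k]:
            pruned[term][doc] = score
    return dict(pruned)


def agnostic(index, threshold):
    """Drop every posting whose score falls below one global threshold."""
    pruned = {}
    for term, postings in index.items():
        kept = {doc: s for doc, s in postings.items() if s >= threshold}
        if kept:  # terms with no surviving postings leave the index entirely
            pruned[term] = kept
    return pruned


def size(index):
    """Total number of postings, a rough proxy for query processing cost."""
    return sum(len(postings) for postings in index.values())
```

In all three cases the pruning happens once, offline, so the query processor simply sees a smaller index. Comparing `size(index)` before and after pruning gives a rough sense of how aggressive a given `k` or `threshold` is.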
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Neural Networks and Applications
Methods: Pruning
