Effective Inference-Free Retrieval for Learned Sparse Representations
Franco Maria Nardini, Thong Nguyen, Cosimo Rulli, Rossano Venturini, and Andrew Yates

TL;DR
This paper introduces Li-LSR, an inference-free retrieval method that improves learned sparse retrieval by replacing query encoding with a token scoring table, achieving state-of-the-art results.
Contribution
The paper proposes Li-LSR, a novel inference-free approach for learned sparse retrieval that enhances efficiency and effectiveness over existing models.
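To make the idea of replacing query encoding with a token scoring table concrete, here is a minimal sketch, not the authors' implementation. It assumes the table is a plain mapping from token to a precomputed weight and that documents are scored by a sparse dot product. The names TOKEN_SCORES, encode_query, and score are hypothetical, and all weights are invented for illustration.

```python
# Hypothetical token scoring table: one precomputed weight per vocabulary token.
# At query time the table is looked up instead of running a neural query encoder.
# All values are made up for illustration.
TOKEN_SCORES = {"sparse": 1.5, "retrieval": 1.2, "learned": 0.8, "the": 0.05}


def encode_query(tokens, table):
    """Build a sparse query vector by table lookup (no model inference)."""
    # Assumption: a repeated token contributes its table weight once, and
    # tokens missing from the table are dropped.
    return {tok: table[tok] for tok in set(tokens) if tok in table}


def score(query_vec, doc_vec):
    """Dot product over the terms shared by the query and the document."""
    return sum(w * doc_vec[t] for t, w in query_vec.items() if t in doc_vec)


# Document vectors would be produced by a document encoder at indexing time;
# these weights are invented.
docs = {
    "d1": {"sparse": 2.0, "retrieval": 1.0, "index": 0.5},
    "d2": {"learned": 1.0, "the": 3.0},
}

query = encode_query(["the", "learned", "sparse", "retrieval"], TOKEN_SCORES)
ranking = sorted(docs, key=lambda d: score(query, docs[d]), reverse=True)
print(ranking)
```

Because the query side is a lookup, the only neural computation in this sketch happens offline when documents are encoded; how the table itself is learned is not covered here.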
Findings
Li-LSR surpasses Splade-v3-Doc by 1 point of MRR@10 on MS MARCO.
Li-LSR achieves 1.8 points higher nDCG@10 on BEIR.
Regularization can be relaxed to improve LSR encoder effectiveness.
Abstract
Learned Sparse Retrieval (LSR) is an effective IR approach that exploits pre-trained language models for encoding text into a learned bag of words. Several efforts in the literature have shown that sparsity is key to enabling a good trade-off between the efficiency and effectiveness of the query processor. To induce the right degree of sparsity, researchers typically use regularization techniques when training LSR models. Recently, new efficient retrieval engines based on inverted indexes have been proposed, leading to a natural question: has the role of regularization changed in training LSR models? In this paper, we conduct an extended evaluation of regularization approaches for LSR where we discuss their effectiveness, efficiency, and out-of-domain generalization capabilities. We first show that regularization can be relaxed to produce more effective LSR encoders. We also show that…
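The abstract does not say which regularizers are evaluated. As an illustration only, the sketch below shows two sparsity penalties commonly used when training LSR models, an L1 penalty and the FLOPS penalty, added to a ranking loss with a weight lam. Relaxing regularization corresponds to lowering lam. The function names and the toy batch are hypothetical.

```python
def l1_penalty(batch):
    """Mean over the batch of the L1 norm of each term-weight vector."""
    return sum(sum(abs(w) for w in vec) for vec in batch) / len(batch)


def flops_penalty(batch):
    """FLOPS penalty: sum over vocabulary entries of the squared mean activation."""
    n = len(batch)
    dims = len(batch[0])
    return sum(
        (sum(abs(vec[j]) for vec in batch) / n) ** 2 for j in range(dims)
    )


def regularized_loss(ranking_loss, batch, lam, penalty):
    """Ranking loss plus a weighted sparsity penalty; a smaller lam relaxes it."""
    return ranking_loss + lam * penalty(batch)


# Toy batch of two term-weight vectors over a three-word vocabulary (invented).
batch = [[1.0, 0.0, 2.0], [3.0, 0.0, 0.0]]

strict = regularized_loss(0.5, batch, lam=0.1, penalty=flops_penalty)
relaxed = regularized_loss(0.5, batch, lam=0.01, penalty=flops_penalty)
print(l1_penalty(batch), flops_penalty(batch), strict, relaxed)
```

With a smaller lam the optimizer is pushed less toward zeroing out term weights, so the learned vectors tend to be denser; the ranking loss value 0.5 is a placeholder, not a measured quantity.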
