Scaling Sparse and Dense Retrieval in Decoder-Only LLMs

Hansi Zeng; Julian Killingback; Hamed Zamani

arXiv:2502.15526 · cs.IR · February 24, 2025

TL;DR

This paper systematically compares sparse and dense retrieval methods built on decoder-only LLMs. It finds that sparse retrieval scales better than dense retrieval and achieves state-of-the-art results when trained with a combination of contrastive learning (CL) and knowledge distillation (KD).

Contribution

The paper provides the first comprehensive analysis of how different retrieval paradigms and training objectives scale in decoder-only LLMs, highlighting the superiority of sparse retrieval at larger scales.

Findings

01 Sparse retrieval outperforms dense retrieval across benchmarks.

02 Scaling benefits are significant only with contrastive learning.

03 Combining contrastive learning (CL) and knowledge distillation (KD) at 8B scale yields state-of-the-art results.
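
To make the sparse vs. dense distinction in these findings concrete, the following is a minimal sketch of how the two paradigms typically score a query against a passage. It is not the paper's implementation: the `log(1 + relu(x))` activation is a SPLADE-style choice assumed here for illustration, and the toy vocabulary, logits, and function names are hypothetical.

```python
import math

def dense_score(q_vec, d_vec):
    """Dense retrieval: dot product of two fixed-size embeddings."""
    return sum(q * d for q, d in zip(q_vec, d_vec))

def sparsify(logits, vocab):
    """Turn per-term logits into a sparse {term: weight} mapping.

    Assumption: log(1 + relu(x)) activation, as in SPLADE-style models.
    Terms whose weight is zero are dropped, which is what makes the
    representation sparse and usable with an inverted index.
    """
    weights = {}
    for term, x in zip(vocab, logits):
        w = math.log1p(max(x, 0.0))
        if w > 0.0:
            weights[term] = w
    return weights

def sparse_score(q_weights, d_weights):
    """Sparse retrieval: dot product over terms active in both sides."""
    return sum(w * d_weights[t] for t, w in q_weights.items() if t in d_weights)

# Toy vocabulary and logits (hypothetical values, for illustration only).
vocab = ["llama", "retrieval", "sparse", "dense", "banana"]
query = sparsify([0.0, 2.0, 1.5, -1.0, -3.0], vocab)
passage = sparsify([1.0, 1.0, 2.0, 0.0, -2.0], vocab)

print(sorted(query))                 # terms the query activates
print(sparse_score(query, passage))  # overlap-weighted relevance score
print(dense_score([0.1, 0.9], [0.2, 0.8]))
```

In both cases relevance is a dot product; the difference is that the sparse representation lives in vocabulary space with most weights equal to zero, while the dense one is a short vector with no interpretable dimensions.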

Abstract

Scaling large language models (LLMs) has shown great potential for improving retrieval model performance; however, previous studies have mainly focused on dense retrieval trained with contrastive loss (CL), neglecting the scaling behavior of other retrieval paradigms and optimization techniques, such as sparse retrieval and knowledge distillation (KD). In this work, we conduct a systematic comparative study on how different retrieval paradigms (sparse vs. dense) and fine-tuning objectives (CL vs. KD vs. their combination) affect retrieval performance across different model scales. Using MSMARCO passages as the training dataset, decoder-only LLMs (Llama-3 series: 1B, 3B, 8B), and a fixed compute budget, we evaluate various training configurations on both in-domain (MSMARCO, TREC DL) and out-of-domain (BEIR) benchmarks. Our key findings reveal that: (1) Scaling behaviors emerge clearly…
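
The two fine-tuning objectives named in the abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' training code: the contrastive loss is written as a standard InfoNCE objective over one positive and several negatives, the KD loss as a KL divergence between teacher and student score distributions, and `kd_weight` is a hypothetical mixing parameter.

```python
import math

def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]

def contrastive_loss(scores, positive_index=0):
    """InfoNCE-style CL: negative log-probability of the positive passage
    among all candidates (positive plus negatives)."""
    return -log_softmax(scores)[positive_index]

def kd_loss(student_scores, teacher_scores):
    """KD as KL(teacher || student) over the same candidate list.
    The teacher would typically be a stronger reranker (assumption)."""
    t_logp = log_softmax(teacher_scores)
    s_logp = log_softmax(student_scores)
    return sum(math.exp(t) * (t - s) for t, s in zip(t_logp, s_logp))

def combined_loss(student_scores, teacher_scores, positive_index=0, kd_weight=1.0):
    """CL + KD; kd_weight is a hypothetical hyperparameter."""
    return (contrastive_loss(student_scores, positive_index)
            + kd_weight * kd_loss(student_scores, teacher_scores))

# Toy scores for one query: index 0 is the positive, the rest are negatives.
student = [2.0, 1.0, 0.5]
teacher = [4.0, 0.5, 0.2]
print(contrastive_loss(student))
print(kd_loss(student, teacher))
print(combined_loss(student, teacher))
```

CL uses only the binary relevance label (which passage is the positive), whereas KD asks the student to match the teacher's graded preferences over all candidates, so the two signals are complementary when summed.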

Taxonomy

Methods: Knowledge Distillation