An Early FIRST Reproduction and Improvements to Single-Token Decoding for Fast Listwise Reranking
Zijian Chen, Ronak Pradeep, Jimmy Lin

TL;DR
This paper reproduces and extends FIRST, a fast listwise reranking method that uses only the first-token logits of a large language model, significantly reducing inference latency while maintaining high reranking quality across multiple datasets.
Contribution
The study extends the evaluation of FIRST to the TREC Deep Learning datasets (DL19-22), evaluates its robustness, examines the impact of different first-stage retrievers, and shows effectiveness improvements from applying the FIRST objective to various backbone models.
Findings
Latency reduced by 21%-42% across models
Maintains reranking quality out-of-domain
Effectiveness surpasses original implementation
Abstract
Recent advances have demonstrated that large language models (LLMs) excel as listwise rerankers, but their high computational demands remain a barrier to widespread adoption. Further, the traditional language modeling (LM) objective is not ideally suited for reranking tasks. FIRST is a novel approach that addresses these challenges by integrating a learning-to-rank objective and leveraging the logits of only the first generated token, thereby significantly reducing inference latency compared to traditional LLM rerankers. In this study, we extend the evaluation of FIRST to the TREC Deep Learning datasets (DL19-22), validating its robustness across diverse domains. We investigate the influence of different first-stage retrievers on FIRST rerankers, observing diminishing returns and patterns consistent with traditional LLM rerankers. Through applying the FIRST objective to a broader range…
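The single-token idea described in the abstract can be illustrated with a minimal sketch. It assumes each candidate passage is labeled with an identifier (here "A", "B", "C") that maps to one vocabulary token, and that the model's logits for the first generated position are already available as a plain list. The function name, the toy vocabulary, and the token ids are hypothetical and are not taken from the paper's code.

```python
from typing import Dict, List, Sequence


def rank_by_first_token_logits(
    first_token_logits: Sequence[float],
    identifier_token_ids: Dict[str, int],
) -> List[str]:
    """Order candidate identifiers by their logit at the first decoding step.

    first_token_logits: logits over the whole vocabulary for the first
        generated position (assumed to be precomputed by the model).
    identifier_token_ids: hypothetical mapping from candidate identifier
        to the vocabulary id of its token.
    """
    scored = [
        (first_token_logits[token_id], identifier)
        for identifier, token_id in identifier_token_ids.items()
    ]
    # A higher logit means the model prefers to emit that identifier first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [identifier for _, identifier in scored]


# Toy vocabulary of 6 tokens; ids 2, 3, 4 stand in for "A", "B", "C".
logits = [0.1, -1.0, 1.5, 3.2, 0.7, 2.0]
ids = {"A": 2, "B": 3, "C": 4}
ranking = rank_by_first_token_logits(logits, ids)
print(ranking)  # ['B', 'A', 'C']
```

Because the full ordering is read off a single forward step, no further tokens need to be generated, which is the source of the latency savings the abstract describes. A conventional listwise reranker would instead decode the whole ranked sequence of identifiers.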
Taxonomy
Topics: Algorithms and Data Compression · Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques
