Efficient Listwise Reranking with Compressed Document Representations

Hervé Déjean; Stéphane Clinchant

arXiv:2604.26483·cs.IR·April 30, 2026

TL;DR

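This paper introduces RRK, a listwise reranker that compresses each document into a multi-token fixed-size embedding representation. The 8B-parameter model runs 3x-18x faster than smaller rerankers (0.6-4B parameters) while matching or outperforming them in effectiveness, with the largest efficiency gains on long documents.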

Contribution

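The paper applies document compression, inspired by recent work on retrieval-augmented generation (RAG), to listwise reranking: documents are compressed into multi-token fixed-size embedding representations, and the reranker is trained with a simple distillation approach. The resulting system improves efficiency while matching or improving effectiveness, particularly on long-document benchmarks.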

Findings

01

The 8B-parameter RRK model runs 3x-18x faster than smaller rerankers (0.6-4B parameters).

02

RRK matches or outperforms smaller rerankers in effectiveness.

03

Efficiency gains are greater on long-document benchmarks.

Abstract

Reranking, the process of refining the output from a first-stage retriever, is often considered computationally expensive, especially when using Large Language Models (LLMs). A common approach to mitigate this cost involves utilizing smaller LLMs or controlling input length. Inspired by recent advances in document compression for retrieval-augmented generation (RAG), we introduce RRK, an efficient and effective listwise reranker compressing documents into multi-token fixed-size embedding representations. Our simple training via distillation shows that this combination of rich compressed representations and listwise reranking yields a highly efficient and effective system. In particular, our 8B-parameter model runs 3x-18x faster than smaller rerankers (0.6-4B parameters) while matching or outperforming them in effectiveness. The efficiency gains are even more striking on long-document…
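To illustrate why fixed-size compressed representations help most on long documents, here is a minimal sketch that counts how many input positions a listwise reranker reads per query. It is not the authors' implementation: the function name `listwise_input_length`, the query length, the candidate count, and the number of embedding slots per document are all hypothetical values chosen for illustration, and the sketch assumes the compression cost is paid outside the reranking step.

```python
def listwise_input_length(doc_lengths, query_len, slots_per_doc=None):
    """Count input positions a listwise reranker reads for one query.

    doc_lengths   -- token length of each candidate document
    query_len     -- token length of the query
    slots_per_doc -- hypothetical fixed number of embedding slots per
                     compressed document; None means raw text is used
    """
    if slots_per_doc is None:
        # Uncompressed: every document token appears in the input.
        return query_len + sum(doc_lengths)
    # Compressed: each document occupies the same fixed number of slots,
    # regardless of how long the original text was.
    return query_len + slots_per_doc * len(doc_lengths)


# Illustrative numbers only (assumptions, not figures from the paper).
QUERY_LEN = 32
SLOTS = 8
candidate_sets = {
    "short documents": [100] * 20,
    "long documents": [2000] * 20,
}

for name, lengths in candidate_sets.items():
    raw = listwise_input_length(lengths, QUERY_LEN)
    compressed = listwise_input_length(lengths, QUERY_LEN, SLOTS)
    print(f"{name}: raw={raw}, compressed={compressed}, "
          f"reduction={raw / compressed:.1f}x")
```

Under these assumptions the compressed input length depends only on the number of candidates, so the reduction grows with document length. That is consistent with the abstract's observation that efficiency gains are more striking on long documents, although measured speedups also depend on model size and hardware.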
