Reranking with Compressed Document Representation

Hervé Déjean; Stéphane Clinchant

arXiv:2505.15394·cs.IR·May 22, 2025

TL;DR

This paper introduces a reranking method that compresses each document into a fixed-size embedding representation and trains the reranker on these compressed inputs by distillation, reducing input size and improving efficiency, especially for long documents.

Contribution

The paper applies document compression, borrowed from recent work on RAG, to reranking and teaches a reranker to use the compressed inputs by distillation, so that a billion-size model can rerank long documents efficiently with competitive effectiveness.

Findings

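01. Despite being based on a billion-size model, the compressed-input reranker can challenge smaller rerankers in both effectiveness and efficiency.

02. The efficiency gain is largest for long documents.

03. Text compressors are still at an early stage of development, so the authors view the approach as promising.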

Abstract

Reranking, the process of refining the output of a first-stage retriever, is often considered computationally expensive, especially with Large Language Models. Borrowing from recent advances in document compression for RAG, we reduce the input size by compressing documents into fixed-size embedding representations. We then teach a reranker to use compressed inputs by distillation. Although based on a billion-size model, our trained reranker using this compressed input can challenge smaller rerankers in terms of both effectiveness and efficiency, especially for long documents. Given that text compressors are still in their early development stages, we view this approach as promising.
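
The two steps named in the abstract, compressing a document to a fixed number of vectors and training a student reranker against teacher scores, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the paper uses a learned compressor and a billion-size reranker, and the abstract does not state the distillation objective. Here chunked mean pooling stands in for the compressor, a dot product stands in for the student, and mean squared error stands in for the loss; the names `compress`, `student_score`, and `distillation_loss` are hypothetical.

```python
from typing import List

Vector = List[float]


def compress(token_embeddings: List[Vector], num_slots: int) -> List[Vector]:
    """Toy stand-in for a learned compressor (assumption, not the paper's model).

    Mean-pools contiguous chunks of token embeddings so that a document of
    any length is represented by exactly `num_slots` vectors.
    """
    if not token_embeddings or num_slots < 1:
        raise ValueError("need at least one token and one slot")
    n = len(token_embeddings)
    dim = len(token_embeddings[0])
    slots = []
    for s in range(num_slots):
        start = s * n // num_slots
        # Guarantee a non-empty chunk when the document is shorter than num_slots.
        end = max((s + 1) * n // num_slots, start + 1)
        chunk = token_embeddings[start:end]
        slots.append([sum(v[d] for v in chunk) / len(chunk) for d in range(dim)])
    return slots


def student_score(query: Vector, slots: List[Vector]) -> float:
    """Toy student reranker: dot product of the query with the average slot."""
    avg = [sum(slot[d] for slot in slots) / len(slots) for d in range(len(query))]
    return sum(q * a for q, a in zip(query, avg))


def distillation_loss(student_scores: List[float], teacher_scores: List[float]) -> float:
    """Mean squared error between student and teacher scores (assumed objective)."""
    if len(student_scores) != len(teacher_scores) or not student_scores:
        raise ValueError("score lists must be non-empty and the same length")
    pairs = zip(student_scores, teacher_scores)
    return sum((s - t) ** 2 for s, t in pairs) / len(student_scores)


# Usage: a 4-token document compressed to 2 slots, scored against one query.
doc = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
slots = compress(doc, num_slots=2)
score = student_score([1.0, 1.0], slots)
loss = distillation_loss([score], teacher_scores=[3.0])
```

The point of the sketch is the shape of the computation: the student's input size depends on `num_slots` rather than on document length, which is where the efficiency gain for long documents would come from.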

Code & Models

Models

naver/modernReranker (Hugging Face model)

Taxonomy

Topics: Algorithms and Data Compression · Natural Language Processing Techniques

Methods: Attention Is All You Need · Linear Warmup With Linear Decay · Softmax · Attention Dropout · WordPiece · Linear Layer · Residual Connection · Byte Pair Encoding · Weight Decay