Reranking with Compressed Document Representation
Hervé Déjean, Stéphane Clinchant

TL;DR
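This paper introduces a reranking method that compresses each document into a fixed-size embedding representation and trains a reranker on these compressed inputs by distillation. Despite being built on a billion-size model, the resulting reranker is competitive with smaller rerankers in both effectiveness and efficiency, especially for long documents.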
Contribution
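The paper adapts document compression, originally developed for RAG, to reranking: documents are compressed into fixed-size embeddings, and a reranker is trained by distillation to score these compressed inputs, so that a billion-size model can rerank long documents efficiently with competitive effectiveness.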
Findings
Although based on a billion-size model, the compressed-input reranker can challenge smaller rerankers in both effectiveness and efficiency.
The approach improves efficiency for long document reranking.
Text compressors are still at an early stage of development, so the authors view the approach as promising.
Abstract
Reranking, the process of refining the output of a first-stage retriever, is often considered computationally expensive, especially with Large Language Models. Borrowing from recent advances in document compression for RAG, we reduce the input size by compressing documents into fixed-size embedding representations. We then teach a reranker to use compressed inputs by distillation. Although based on a billion-size model, our trained reranker using this compressed input can challenge smaller rerankers in terms of both effectiveness and efficiency, especially for long documents. Given that text compressors are still in their early development stages, we view this approach as promising.
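The distillation step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which loss the authors use, so the KL divergence between teacher and student score distributions shown here is an assumption (a common choice for reranker distillation), and the names `log_softmax`, `distillation_loss`, and `rerank` are hypothetical. The student scores stand in for the output of a reranker that reads compressed document embeddings; the compressor itself is not modeled.

```python
import math

def log_softmax(scores, temperature=1.0):
    """Numerically stable log-softmax over one query's candidate scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    log_total = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_total for s in scaled]

def distillation_loss(teacher_scores, student_scores, temperature=1.0):
    """KL(teacher || student) over the candidate list of a single query.

    Assumption: the teacher is a reranker reading full text, and the student
    is the reranker reading compressed document representations.
    """
    log_p = log_softmax(teacher_scores, temperature)
    log_q = log_softmax(student_scores, temperature)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

def rerank(doc_ids, scores):
    """Order first-stage candidates by descending reranker score."""
    ranked = sorted(zip(doc_ids, scores), key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in ranked]

# Hypothetical scores for three first-stage candidates of one query.
teacher = [2.0, 0.5, -1.0]
student = [1.5, 0.7, -0.8]
loss = distillation_loss(teacher, student)
order = rerank(["d1", "d2", "d3"], student)
```

In training, a loss of this kind would be minimized with respect to the student's parameters, so that the student's ranking over compressed inputs approaches the teacher's ranking over full text.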
Taxonomy
Topics: Algorithms and Data Compression · Natural Language Processing Techniques
Methods: Attention Is All You Need · Linear Warmup With Linear Decay · Softmax · Attention Dropout · WordPiece · Linear Layer · Residual Connection · Byte Pair Encoding · Weight Decay
