Efficient Long-Document Reranking via Block-Level Embeddings and Top-k Interaction Refinement
Minghan Li, Eric Gaussier, Guodong Zhou

TL;DR
This paper introduces an efficient long-document reranking method using block-level embeddings and a lightweight interaction refinement, significantly improving relevance scoring while maintaining low latency and interpretability.
Contribution
It proposes a novel block-level embedding framework combined with Top-k Interaction Refinement for effective and efficient long-document reranking.
Findings
Block embeddings outperform single-vector encodings.
TIR consistently improves reranking performance.
The method maintains low latency suitable for practical use.
Abstract
Dense encoders and LLM-based rerankers struggle with long documents: single-vector representations dilute fine-grained relevance, while cross-encoders are often too expensive for practical reranking. We present an efficient long-document reranking framework based on block-level embeddings. Each document is segmented into short blocks and encoded into block embeddings that can be precomputed offline. Given a query, we encode it once and score each candidate document by aggregating top-k query-block similarities with a simple weighted sum, yielding a strong and interpretable block-level relevance signal. To capture dependencies among the selected blocks and suppress redundancy, we introduce Top-k Interaction Refinement (TIR), a lightweight setwise module that applies query-conditioned attention over the top-k blocks and produces a bounded residual correction to block scores. TIR…
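The top-k aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `topk_block_score`, the use of cosine similarity, and the geometrically decaying weights are all assumptions made for illustration, since the abstract specifies only "a simple weighted sum" over the top-k query-block similarities. TIR is not sketched here because the abstract is truncated before describing it fully.

```python
import numpy as np

def topk_block_score(query_emb, block_embs, k=3, weights=None):
    """Score one document from its (precomputed) block embeddings.

    Hypothetical helper: cosine similarity and the default weights
    are illustrative assumptions, not taken from the paper.
    """
    query_emb = np.asarray(query_emb, dtype=float)
    block_embs = np.asarray(block_embs, dtype=float)

    # Normalize so that the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb)
    b = block_embs / np.linalg.norm(block_embs, axis=1, keepdims=True)

    sims = b @ q                      # one similarity per block
    k = min(k, sims.shape[0])         # documents may have fewer than k blocks
    top_idx = np.argsort(-sims)[:k]   # indices of the k most similar blocks
    top_sims = sims[top_idx]

    if weights is None:
        # Assumed weighting: geometric decay, normalized to sum to 1.
        weights = 0.5 ** np.arange(k)
        weights = weights / weights.sum()
    else:
        weights = np.asarray(weights, dtype=float)[:k]

    score = float(weights @ top_sims)
    return score, top_idx, top_sims

# Toy usage: a 2-d query and a document split into four blocks.
query = [1.0, 0.0]
blocks = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
score, top_idx, top_sims = topk_block_score(query, blocks, k=2)
print(score, top_idx.tolist())
```

Returning the selected block indices alongside the score reflects the interpretability claim: the blocks that drive the document score can be inspected directly. Because block embeddings do not depend on the query, only the query encoding and this inexpensive aggregation run at query time.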
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Semantic Web and Ontologies
