CODER: An efficient framework for improving retrieval through COntextual Document Embedding Reranking
George Zerveas, Navid Rekabsaz, Daniel Cohen, Carsten Eickhoff

TL;DR
CODER is a fast, context-aware reranking framework that improves dense retrieval models by training with a large number of retrieved (query-specific) negatives per query and a list-wise loss, yielding significant gains on the MS MARCO and TripClick datasets.
Contribution
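The paper presents CODER, an efficient reranking framework that incorporates ranking context into the training of dense retrieval models. At run time it adds only negligible overhead on top of a first-stage retriever (a per-query delay on the order of milliseconds), so it can be combined with any state-of-the-art dual encoder method.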
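As a rough illustration of why reranking in embedding space is cheap, the sketch below rescores first-stage candidates with a single dot product per document. It assumes the candidate document embeddings are already available (for example, precomputed by the first-stage dual encoder); the function names `dot` and `rerank` and the toy vectors are hypothetical and not taken from the paper.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rerank(query_emb, candidates):
    """Rescore first-stage candidates against a query embedding.

    candidates: list of (doc_id, doc_embedding) pairs returned by a
    first-stage retriever (assumed to be precomputed embeddings).
    Returns (doc_id, score) pairs sorted by descending score.
    """
    scored = [(doc_id, dot(query_emb, emb)) for doc_id, emb in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy example with 2-dimensional embeddings (illustrative values only).
query = [1.0, 0.0]
first_stage = [
    ("d1", [0.2, 0.9]),
    ("d2", [0.9, 0.1]),
    ("d3", [0.5, 0.5]),
]
ranked = rerank(query, first_stage)
ranked_ids = [doc_id for doc_id, _ in ranked]
```

The cost per query is one inner product per candidate plus a sort, which is consistent with the millisecond-scale overhead described above, although the actual CODER implementation may differ.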
Findings
Significant performance improvements on the MS MARCO and TripClick datasets.
Results improve further when more relevance information is available per query.
Models fine-tuned through CODER can also be used as standalone retrievers.
Abstract
Contrastive learning has been the dominant approach to training dense retrieval models. In this work, we investigate the impact of ranking context, an often overlooked aspect of learning dense retrieval models. In particular, we examine the effect of its constituent parts: jointly scoring a large number of negatives per query, using retrieved (query-specific) instead of random negatives, and a fully list-wise loss. To incorporate these factors into training, we introduce Contextual Document Embedding Reranking (CODER), a highly efficient retrieval framework. When reranking, it incurs only a negligible computational overhead on top of a first-stage method at run time (delay per query on the order of milliseconds), allowing it to be easily combined with any state-of-the-art dual encoder method. After fine-tuning through CODER, which is a lightweight and fast process, models can also be…
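To make the idea of a list-wise loss over jointly scored candidates concrete, here is a minimal sketch. The abstract does not give the exact loss, so this uses softmax cross-entropy over the full candidate list, which is one common list-wise choice and only an assumption here; `listwise_loss` and the toy data are hypothetical. The candidate list stands in for documents retrieved for this specific query, of which at least one is assumed to be labeled relevant.

```python
import math


def listwise_loss(query_emb, cand_embs, relevance):
    """Softmax cross-entropy over one query's whole candidate list.

    query_emb: query embedding (list of floats).
    cand_embs: embeddings of all candidates scored jointly for this query.
    relevance: non-negative relevance label per candidate (sum must be > 0).
    """
    # Score every candidate against the query with a dot product.
    scores = [sum(q * d for q, d in zip(query_emb, emb)) for emb in cand_embs]
    # Numerically stable log-partition over the full list.
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    # Normalize labels into a target distribution over the list.
    total = sum(relevance)
    return -sum((r / total) * (s - log_z) for r, s in zip(relevance, scores))


# Toy list: three retrieved candidates, the second one is relevant.
cands = [[0.2, 0.9], [0.9, 0.1], [0.5, 0.5]]
labels = [0.0, 1.0, 0.0]
loss_aligned = listwise_loss([1.0, 0.0], cands, labels)     # query near the relevant doc
loss_misaligned = listwise_loss([0.0, 1.0], cands, labels)  # query near a negative
loss_uniform = listwise_loss([0.0, 0.0], cands, labels)     # all scores equal
```

Because the normalization runs over every candidate in the list, each negative contributes to the gradient for that query, in contrast to a pairwise or single-negative objective.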
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Topic Modeling
Methods: Attentive Walk-Aggregating Graph Neural Network · Balanced Selection
