Improving Out-of-Distribution Generalization of Neural Rerankers with Contextualized Late Interaction
Xinyu Zhang, Minghan Li, and Jimmy Lin

TL;DR
This paper demonstrates that incorporating a simple late interaction mechanism into neural rerankers significantly improves their out-of-distribution generalization with minimal latency increase, especially for longer queries.
Contribution
It introduces the use of contextualized late interaction in neural rerankers, showing consistent out-of-distribution performance gains across various models and retrievers.
Findings
Adding late interaction yields an extra 5% improvement on average on out-of-distribution datasets.
The method maintains in-domain effectiveness with little latency increase.
Longer queries benefit more from the late interaction approach.
Abstract
Recent progress in information retrieval finds that embedding queries and documents into multi-vector representations yields a bi-encoder retriever that is robust on out-of-distribution datasets. In this paper, we explore whether late interaction, the simplest form of multi-vector representation, is also helpful to neural rerankers that use only the [CLS] vector to compute the similarity score. Although, intuitively, the attention mechanism in the reranker's earlier layers already gathers token-level information, we find that adding late interaction still brings an extra 5% improvement on average on out-of-distribution datasets, with little increase in latency and no degradation in in-domain effectiveness. Through extensive experiments and analysis, we show that the finding is consistent across different model sizes and first-stage retrievers of diverse natures and that the improvement is more prominent on…
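To make "late interaction" concrete, here is a minimal sketch of the ColBERT-style MaxSim operator that the term usually refers to: each query token vector is matched to its most similar document token vector, and the per-token maxima are summed. This is an illustration, not the authors' implementation. The function names, the toy 2-dimensional vectors, and the use of cosine similarity are assumptions, and the abstract does not say how the paper combines this score with the reranker's [CLS] score.

```python
import math


def normalize(vec):
    """L2-normalize a vector so that dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def late_interaction_score(query_vecs, doc_vecs):
    """ColBERT-style MaxSim (assumed form): sum over query tokens of the
    maximum cosine similarity to any document token."""
    q = [normalize(v) for v in query_vecs]
    d = [normalize(v) for v in doc_vecs]
    total = 0.0
    for qv in q:
        # Best-matching document token for this query token.
        total += max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
    return total


# Toy token embeddings (hypothetical; real ones come from the reranker's
# final-layer hidden states).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_match = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # covers both query tokens
doc_partial = [[1.0, 0.0], [1.0, 0.0]]            # covers only the first

score_match = late_interaction_score(query, doc_match)
score_partial = late_interaction_score(query, doc_partial)
```

In this toy case the document that covers both query tokens scores higher than the one that covers only the first. Because the score is a sum over query tokens, every query token contributes its own term.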
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Topic Modeling
