Working Notes on Late Interaction Dynamics: Analyzing Targeted Behaviors of Late Interaction Models
Antoine Edy, Max Conti, Quentin Macé

TL;DR
This paper investigates the internal behaviors of Late Interaction retrieval models, focusing on length bias and similarity distribution, to understand their performance and potential bottlenecks.
Contribution
It provides an analysis of length bias and similarity patterns in state-of-the-art Late Interaction models, highlighting practical implications and validating the MaxSim operator’s effectiveness.
Findings
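The theoretical length bias of causal Late Interaction models holds in practice.
Bi-directional models can also suffer from length bias in extreme cases.
No significant similarity trend appears beyond the top-1 document token, which supports the view that the MaxSim operator efficiently exploits token-level similarity scores.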
Abstract
While Late Interaction models exhibit strong retrieval performance, many of their underlying dynamics remain understudied, potentially hiding performance bottlenecks. In this work, we focus on two topics in Late Interaction retrieval: a length bias that arises when using multi-vector scoring, and the similarity distribution beyond the best scores pooled by the MaxSim operator. We analyze these behaviors for state-of-the-art models on the NanoBEIR benchmark. Results show that while the theoretical length bias of causal Late Interaction models holds in practice, bi-directional models can also suffer from it in extreme cases. We also note that no significant similarity trend lies beyond the top-1 document token, validating that the MaxSim operator efficiently exploits the token-level similarity scores.
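For readers unfamiliar with multi-vector scoring, the following is a minimal sketch of the MaxSim operator as it is commonly defined: for each query token, keep the highest similarity over all document tokens, then sum over query tokens. The function names (`maxsim_score`, `l2_normalize`), the random embeddings, and the dimensions are illustrative assumptions, not the authors' code or experimental setup. The second half illustrates one intuition for a length bias under a simplifying assumption: if appending text leaves the existing token embeddings unchanged (roughly what one would expect from a causal encoder), the extra tokens can only add candidates to each maximum, so the score cannot go down.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def maxsim_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """MaxSim: sum over query tokens of the max similarity over document tokens.

    query_emb has shape (n_query_tokens, dim); doc_emb has shape (n_doc_tokens, dim).
    """
    sim = query_emb @ doc_emb.T          # (n_query_tokens, n_doc_tokens)
    return float(sim.max(axis=1).sum())  # best document token per query token


# Hypothetical random embeddings, for illustration only.
rng = np.random.default_rng(0)
query = l2_normalize(rng.normal(size=(4, 8)))
short_doc = l2_normalize(rng.normal(size=(5, 8)))

# Assumption: appended tokens do not alter the embeddings of earlier tokens.
extra_tokens = l2_normalize(rng.normal(size=(50, 8)))
long_doc = np.vstack([short_doc, extra_tokens])

short_score = maxsim_score(query, short_doc)
long_score = maxsim_score(query, long_doc)
print(f"short document: {short_score:.3f}")
print(f"long document:  {long_score:.3f}")
```

Because only the per-query-token maximum enters the score, everything below the top-1 document token is discarded. That discarded part of the similarity distribution is what the abstract refers to as lying beyond the best scores pooled by MaxSim.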
