Spike Hijacking in Late-Interaction Retrieval

Karthik Suresh; Tushar Vatsa; Tracy King; Asim Kadav; Michael Friedrich

arXiv:2604.05253 · cs.IR · April 8, 2026

TL;DR

This paper analyzes how MaxSim pooling in late-interaction retrieval models causes gradient concentration and sensitivity to document length, revealing a tradeoff between discrimination and robustness.

Contribution

The paper provides a mechanistic study of gradient routing in MaxSim, demonstrates the biases this pooling rule introduces, and motivates the need for principled pooling alternatives.

Findings

01

MaxSim induces significantly higher patch-level gradient concentration than smoother alternatives such as Top-k pooling and softmax aggregation.

02

Sparse routing can improve early discrimination but increases sensitivity to document length.

03

As the number of document patches grows, MaxSim degrades more sharply than mild smoothing variants.
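
The contrast in gradient routing behind these findings can be sketched for a single query token. The snippet below is a minimal illustration, not the paper's code: the helper names, the example similarities, `k=2`, the temperature of 0.1, and the use of a temperature-scaled log-sum-exp as the softmax-style pooling are all assumptions, and `top_share` is an illustrative concentration measure rather than the metric used in the paper.

```python
import math

# Each function returns d(pooled score)/d(sim_j) for one query token,
# i.e. how much gradient each document patch receives.

def grad_maxsim(sims):
    # Hard max: all gradient goes to the single best-matching patch.
    best = max(range(len(sims)), key=lambda i: sims[i])
    return [1.0 if i == best else 0.0 for i in range(len(sims))]

def grad_topk(sims, k=2):
    # Mean of the k largest similarities: gradient is split evenly among them.
    top = set(sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k])
    return [1.0 / len(top) if i in top else 0.0 for i in range(len(sims))]

def grad_smooth_max(sims, temperature=0.1):
    # Assumed smooth pooling: temperature * logsumexp(sims / temperature).
    # Its gradient with respect to the similarities is a softmax.
    peak = max(sims)
    exps = [math.exp((s - peak) / temperature) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]

def top_share(weights):
    # Illustrative concentration measure (an assumption): the fraction of
    # the total gradient received by the most-favoured patch.
    return max(weights) / sum(weights)

sims = [0.9, 0.2, 0.5, 0.85]  # hypothetical query-to-patch similarities
for name, grad in [("maxsim", grad_maxsim),
                   ("top-k", grad_topk),
                   ("smooth max", grad_smooth_max)]:
    print(name, round(top_share(grad(sims)), 3))
```

In this toy case MaxSim routes the entire gradient to one patch, while the two smoother rules spread it over several patches; raising `k` or the temperature spreads it further.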

Abstract

Late-interaction retrieval models rely on hard maximum similarity (MaxSim) to aggregate token-level similarities. Although effective, this winner-take-all pooling rule may structurally bias training dynamics. We provide a mechanistic study of gradient routing and robustness in MaxSim-based retrieval. In a controlled synthetic environment with in-batch contrastive training, we demonstrate that MaxSim induces significantly higher patch-level gradient concentration than smoother alternatives such as Top-k pooling and softmax aggregation. While sparse routing can improve early discrimination, it also increases sensitivity to document length: as the number of document patches grows, MaxSim degrades more sharply than mild smoothing variants. We corroborate these findings on a real-world multi-vector retrieval benchmark, where controlled document-length sweeps reveal similar brittleness under…
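
The pooling rules named in the abstract can be written down concretely. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy embeddings, `k=2`, and the temperature are hypothetical, and the softmax aggregation is approximated here by a temperature-scaled log-sum-exp, which may differ from the paper's exact form.

```python
import math

def dot(u, v):
    # Token-level similarity between one query vector and one patch vector.
    return sum(a * b for a, b in zip(u, v))

def pool_maxsim(sims):
    # Winner-take-all: keep only the best-matching patch.
    return max(sims)

def pool_topk(sims, k=2):
    # Average the k largest similarities.
    top = sorted(sims, reverse=True)[:k]
    return sum(top) / len(top)

def pool_smooth_max(sims, temperature=0.1):
    # Assumed smooth variant: temperature * logsumexp(sims / temperature),
    # computed in a numerically stable way.
    peak = max(sims)
    total = sum(math.exp((s - peak) / temperature) for s in sims)
    return peak + temperature * math.log(total)

def late_interaction_score(query_vecs, doc_vecs, pool):
    # For each query token, pool its similarities over all document patches,
    # then sum the pooled values over query tokens.
    return sum(pool([dot(q, d) for d in doc_vecs]) for q in query_vecs)

# Hypothetical toy embeddings: 2 query tokens, 3 document patches.
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]

for name, pool in [("maxsim", pool_maxsim),
                   ("top-k", pool_topk),
                   ("smooth max", pool_smooth_max)]:
    print(name, round(late_interaction_score(query, doc, pool), 4))
```

Swapping the `pool` argument changes only the aggregation over document patches; the sum over query tokens is the same in all three cases.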
