Training for Compositional Sensitivity Reduces Dense Retrieval Generalization
Radoslav Ralev, Aditeya Baral, Iliya Zhechev, Jen Agarwal, Srijith Rajamohan

TL;DR
This paper investigates whether training dense retrieval models for compositional sensitivity (reducing their brittleness to minimal, meaning-changing text edits) comes at the cost of zero-shot generalization, and finds that it does.
Contribution
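It demonstrates that incorporating structure-targeted negatives consistently reduces zero-shot retrieval performance while only partially improving pooled-space separation, and it benchmarks verifiers over token-token cosine maps, including a small Transformer-based verifier for better structural near-miss rejection.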
Findings
Adding structure-targeted negatives reduces zero-shot NanoBEIR retrieval by 8-9% mean nDCG@10 on small backbones and by up to 40% on medium ones
MaxSim excels at reranking but fails to reject structural near-misses; a small Transformer verifier improves separation
Structure-targeted negatives only partially improve separation in pooled cosine space
Abstract
Dense retrieval compresses texts into single embeddings ranked by cosine similarity. While efficient for recall, this interface is brittle for identity-level matching: minimal compositional edits (negation, role swaps) flip meaning yet retain high similarity. Motivated by geometric results for unit-sphere cosine spaces (Kang et al., 2025), we test this retrieval-composition tension in text-only retrieval. Across four dual-encoder backbones, adding structure-targeted negatives consistently reduces zero-shot NanoBEIR retrieval (8-9% mean nDCG@10 drop on small backbones; up to 40% on medium ones), while only partially improving pooled-space separation. Treating pooled cosine as a recall interface, we then benchmark verifiers scoring token-token cosine maps. MaxSim (late interaction) excels at reranking but fails to reject structural near-misses, whereas a small Transformer over similarity…
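The two scoring interfaces named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the token vectors in TOY_VECS are hypothetical, context-free, and one-hot (real encoders produce contextual embeddings), and the helper names embed, pooled_cosine, and maxsim are introduced here for illustration only. MaxSim is averaged over query tokens to keep scores in [-1, 1], which is an assumption of this sketch rather than a detail from the paper. Under these assumptions, a role swap leaves both the mean-pooled cosine and the MaxSim score unchanged, which is one way to see why neither score alone can reject such a near-miss.

```python
import math

# Hypothetical context-free token vectors (toy values, not from the paper).
TOY_VECS = {
    "dog":    [1.0, 0.0, 0.0, 0.0],
    "bites":  [0.0, 1.0, 0.0, 0.0],
    "man":    [0.0, 0.0, 1.0, 0.0],
    "sleeps": [0.0, 0.0, 0.0, 1.0],
}

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def embed(text):
    """Look up one toy vector per whitespace-separated token."""
    return [TOY_VECS[tok] for tok in text.split()]

def pooled_cosine(query, doc):
    """Mean-pool token vectors into one embedding per text, then compare."""
    def mean_pool(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]
    return cosine(mean_pool(embed(query)), mean_pool(embed(doc)))

def maxsim(query, doc):
    """Late interaction: each query token takes its best-matching doc token.

    Averaged over query tokens (an assumption of this sketch).
    """
    q, d = embed(query), embed(doc)
    return sum(max(cosine(qt, dt) for dt in d) for qt in q) / len(q)

original = "dog bites man"
role_swap = "man bites dog"   # same tokens, opposite meaning
unrelated = "dog sleeps"

for name, score in (("pooled", pooled_cosine), ("maxsim", maxsim)):
    print(name, score(original, role_swap), score(original, unrelated))
```

In this toy setting both scores treat the role-swapped text as a perfect match while still separating the unrelated text, so the distinction has to come from a component that looks at the arrangement of the token-token similarity map rather than only its row-wise maxima or a pooled summary.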
