When Does Embedding Magnitude Matter? A Cross-Task Functional-Symmetry Framework

Xincan Feng; Taro Watanabe

arXiv:2602.09229·cs.LG·May 11, 2026

TL;DR

This paper introduces a 2x2 normalization framework for embedding similarity that controls query-side and document-side normalization independently, and shows that normalizing only one side improves performance on retrieval and other tasks across domains.

Contribution

The paper proposes a framework that controls query-side and document-side normalization independently, exposing two intermediate variants (QNorm and DNorm) that outperform cosine similarity and dot product on retrieval.
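The four cells of such a 2x2 grid can be sketched with two boolean flags. This is a minimal illustration, not the authors' implementation: the function names, the `VARIANTS` labels, and the reading of QNorm as "normalize only the query" and DNorm as "normalize only the document" are assumptions made for this example.

```python
import numpy as np

def _unit(x, eps=1e-12):
    """L2-normalize each row of x; eps guards against zero-length vectors."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)

def similarity(q, d, norm_q=False, norm_d=False):
    """Score every query (rows of q) against every document (rows of d).

    norm_q / norm_d independently switch L2 normalization on each side.
    """
    q = _unit(q) if norm_q else q
    d = _unit(d) if norm_d else d
    return q @ d.T

# Hypothetical labels for the four cells of the grid (assumed naming).
VARIANTS = {
    "dot":    dict(norm_q=False, norm_d=False),  # normalize neither side
    "qnorm":  dict(norm_q=True,  norm_d=False),  # query side only (assumed)
    "dnorm":  dict(norm_q=False, norm_d=True),   # document side only (assumed)
    "cosine": dict(norm_q=True,  norm_d=True),   # normalize both sides
}

# Random stand-ins for encoder outputs: 2 queries, 5 documents, dimension 8.
rng = np.random.default_rng(0)
q = rng.normal(size=(2, 8))
d = rng.normal(size=(5, 8))

scores = {name: similarity(q, d, **flags) for name, flags in VARIANTS.items()}
```

Each entry of `scores` is a 2-by-5 matrix, so the four variants can be compared on the same query and document embeddings by switching flags only.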

Findings

01

Unilateral variants outperform cosine and dot product in retrieval tasks.

02

Document magnitude scales inference scores, while query magnitude modulates training gradients.

03

Task functional symmetry predicts the effectiveness of normalization variants across diverse tasks.

Abstract

Cosine similarity normalizes both sides; dot product normalizes neither. We propose a 2x2 framework that independently controls query-side and document-side normalization, exposing two intermediate variants (QNorm, DNorm) that have not been previously studied. On retrieval with four encoders, evaluated in-domain on MS MARCO and out-of-domain on BEIR, BRIGHT, and multi-hop QA, the unilateral variants outperform both cosine and dot product, with relative gains of up to +72% out-of-domain and +24% on downstream RAG. Cross-evaluation reveals the mechanism: document magnitude scales inference scores while query magnitude modulates training gradients, and the Fisher Information Matrix condition number predicts which side to normalize. We then classify tasks by functional symmetry, defined as whether the aggregate scoring procedure treats Q and C as interchangeable, and test whether the…
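The inference half of the stated mechanism can be checked with a small sketch. For a single query, dividing the query by its norm rescales every document score by the same positive constant, so the ranking is unchanged, whereas rescaling a document changes that document's score directly. The random embeddings and the `ranking` helper below are assumptions for illustration; the sketch does not touch the training-gradient claim or the Fisher Information Matrix analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
q = rng.normal(size=8)          # one query embedding (random stand-in)
docs = rng.normal(size=(6, 8))  # six document embeddings (random stand-ins)

def ranking(scores):
    """Return document indices ordered from highest to lowest score."""
    return np.argsort(-scores)

# Dot product scores versus scores with only the query normalized.
dot_scores = docs @ q
qnorm_scores = docs @ (q / np.linalg.norm(q))

# Query magnitude is a shared positive factor, so the ranking is identical.
same_ranking = np.array_equal(ranking(dot_scores), ranking(qnorm_scores))

# Tripling one document's magnitude triples that document's score only.
scaled_docs = docs.copy()
scaled_docs[0] *= 3.0
scaled_scores = scaled_docs @ q
```

Under these assumptions, query-side normalization cannot change a per-query ranking at inference time, which is consistent with the abstract locating its effect in training gradients instead.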
