In Defense of Cosine Similarity: Normalization Eliminates the Gauge Freedom

Taha Bouhsine

arXiv:2602.19393·cs.LG·February 24, 2026

TL;DR

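This paper defends cosine similarity for embeddings by proving that constraining embeddings to the unit sphere removes the gauge ambiguity identified in prior work. On the sphere, cosine distance and squared Euclidean distance are monotonically equivalent, so the earlier caution against cosine similarity does not apply to normalized embeddings.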

Contribution

The paper shows that when embeddings are normalized to the unit sphere, the diagonal gauge ambiguity affecting cosine similarity disappears, and cosine distance becomes geometrically equivalent to squared Euclidean distance.

Findings

01

Gauge ambiguity vanishes on the unit sphere with normalization.

02

Cosine distance equals exactly half the squared Euclidean distance on normalized embeddings.

03

The issues previously attributed to cosine similarity stem from a training objective that is incompatible with it, and normalization resolves them.
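The second finding follows from expanding $\|u - v\|^2 = \|u\|^2 + \|v\|^2 - 2\,u \cdot v$, which equals $2 - 2\cos\theta$ when $\|u\| = \|v\| = 1$. The snippet below is a minimal numerical check of that identity, not code from the paper; the helper names `cosine_distance` and `normalize`, the dimension 8, and the random seed are illustrative assumptions.

```python
import numpy as np


def cosine_distance(u, v):
    """Return 1 - cos(theta) for two nonzero vectors."""
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))


def normalize(x):
    """Project a nonzero vector onto the unit sphere."""
    return x / np.linalg.norm(x)


# Hypothetical embeddings: two random vectors, normalized to unit length.
rng = np.random.default_rng(0)
u = normalize(rng.normal(size=8))
v = normalize(rng.normal(size=8))

# Half the squared Euclidean distance between the unit vectors.
half_sq_euclid = 0.5 * np.sum((u - v) ** 2)

# On the unit sphere the two quantities agree up to floating-point error.
identity_holds = np.isclose(cosine_distance(u, v), half_sq_euclid)
print(identity_holds)
```

Because one quantity is a fixed multiple of the other, ranking neighbors by cosine distance or by Euclidean distance gives the same order on normalized embeddings.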

Abstract

Steck, Ekanadham, and Kallus [arXiv:2403.05440] demonstrate that cosine similarity of learned embeddings from matrix factorization models can be rendered arbitrary by a diagonal "gauge" matrix $D$. Their result is correct and important for practitioners who compute cosine similarity on embeddings trained with dot-product objectives. However, we argue that their conclusion, cautioning against cosine similarity in general, conflates the pathology of an incompatible training objective with the geometric validity of cosine distance on the unit sphere. We prove that when embeddings are constrained to the unit sphere $S^{d-1}$ (either during or after training with an appropriate objective), the $D$-matrix ambiguity vanishes identically, and cosine distance reduces to exactly half the squared Euclidean distance. This monotonic equivalence implies that cosine-based and…
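The gauge freedom described in the abstract can be illustrated numerically. The sketch below assumes a dot-product factorization with two embedding matrices, here called `A` and `B` (hypothetical names and shapes), and an arbitrary positive diagonal matrix `D`. It shows that rescaling by `D` leaves all dot products unchanged while altering cosine similarities, and that the rescaled rows leave the unit sphere. It is an illustration under these assumptions, not the paper's proof.

```python
import numpy as np


def cosine_similarity_matrix(M):
    """Pairwise cosine similarity between the rows of M."""
    unit_rows = M / np.linalg.norm(M, axis=1, keepdims=True)
    return unit_rows @ unit_rows.T


rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))  # hypothetical embeddings, one per row
B = rng.normal(size=(4, 3))  # hypothetical embeddings, one per row
D = np.diag([0.1, 1.0, 10.0])  # arbitrary diagonal rescaling (assumption)

# Gauge transformation: A -> A D, B -> B D^{-1}.
A_scaled = A @ D
B_scaled = B @ np.linalg.inv(D)

# The dot-product objective only sees A B^T, which is unchanged.
dot_products_unchanged = np.allclose(A @ B.T, A_scaled @ B_scaled.T)

# Cosine similarity among the rows of A does depend on D.
cosine_changed = not np.allclose(
    cosine_similarity_matrix(A), cosine_similarity_matrix(A_scaled)
)

# If the rows of A are constrained to unit norm, this rescaling
# violates the constraint: the rescaled rows no longer have norm 1.
A_unit = A / np.linalg.norm(A, axis=1, keepdims=True)
norms_still_unit = np.allclose(np.linalg.norm(A_unit @ D, axis=1), 1.0)

print(dot_products_unchanged, cosine_changed, norms_still_unit)
```

The last check reflects the intuition behind the abstract's claim: a unit-norm constraint on every embedding is not preserved by this nontrivial rescaling, so the rescaling is no longer a free choice.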

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Child and Animal Learning Development · Advanced Graph Neural Networks