
TL;DR
This paper introduces recos, a similarity metric derived from an upper bound on the dot product that is tighter than the Cauchy-Schwarz bound behind cosine similarity. It captures a broader class of semantic relationships and is reported to outperform traditional similarity measures across 11 embedding models.
Contribution
The paper derives a new similarity measure, recos, from an upper bound on the dot product that is tighter than the Cauchy-Schwarz bound, and reports better performance than cosine similarity across 11 embedding models (static, contextualized, and universal) and multiple benchmarks.
Findings
recos outperforms cosine similarity in semantic tasks
recos achieves higher correlation with human judgments
recos captures broader semantic relationships
Abstract
Cosine similarity, the standard metric for measuring semantic similarity in vector spaces, is mathematically grounded in the Cauchy-Schwarz inequality, which inherently limits it to capturing linear relationships--a constraint that fails to model the complex, nonlinear structures of real-world semantic spaces. We advance this theoretical underpinning by deriving a tighter upper bound for the dot product than the classical Cauchy-Schwarz bound. This new bound leads directly to recos, a similarity metric that normalizes the dot product by the sorted vector components. recos relaxes the condition for perfect similarity from strict linear dependence to ordinal concordance, thereby capturing a broader class of relationships. Extensive experiments across 11 embedding models--spanning static, contextualized, and universal types--demonstrate that recos consistently outperforms traditional…
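The abstract does not state the recos formula, so the following is only a sketch of one plausible reading: by the rearrangement inequality, the dot product of two vectors is at most the dot product of their ascending-sorted copies, which in turn is at most the Cauchy-Schwarz bound. The function names `sorted_bound` and `recos_sketch` are hypothetical, and the handling of a non-positive bound is an assumption made here, not something taken from the paper.

```python
import numpy as np

def cosine(x, y):
    """Standard cosine similarity: dot product over the Cauchy-Schwarz bound."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

def sorted_bound(x, y):
    """Dot product of the two vectors after sorting each in ascending order.

    By the rearrangement inequality this is >= x @ y, and by Cauchy-Schwarz
    it is <= ||x|| * ||y||, so it sits between the two.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.sort(x) @ np.sort(y))

def recos_sketch(x, y):
    """Hypothetical reading of recos: dot product divided by the sorted bound.

    Assumption: the paper's exact formula is not in the abstract, so this
    sketch only covers the case where the sorted bound is positive.
    """
    bound = sorted_bound(x, y)
    if bound <= 0:
        raise ValueError("sorted bound is not positive; not covered by this sketch")
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(x @ y / bound)

# y is a monotone but nonlinear function of x: same ordering, not collinear.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = x ** 3
print(cosine(x, y))        # below 1: x and y are not linearly dependent
print(recos_sketch(x, y))  # 1.0: the components of x and y share one ordering
```

In this example the two vectors are ordinally concordant but not linearly dependent, which is the situation the abstract describes: the sorted-bound ratio reaches 1 while cosine similarity stays below it.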
Taxonomy
Topics: Topic Modeling · Advanced Graph Neural Networks · Machine Learning in Healthcare
