Calibrated Similarity for Reliable Geometric Analysis of Embedding Spaces

Nicolas Tacheny

arXiv:2601.16907 · cs.LG · January 26, 2026

TL;DR

This paper introduces a monotonic calibration method using isotonic regression to improve the interpretability of cosine similarity scores in embedding spaces, maintaining ranking while correcting score miscalibration.

Contribution

It presents a calibration technique that preserves the geometric structure of embeddings and enhances interpretability without altering existing similarity rankings.

Findings

01 Achieves near-perfect calibration of similarity scores

02 Preserves rank correlation and local stability

03 Leaves order-based constructions invariant under calibration
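
The invariance finding follows from monotonicity: any construction that depends only on the order of similarity scores, such as a top-k neighbor list, is unchanged by a strictly increasing map. The sketch below illustrates this on toy data. The vectors, the helper names `cosine` and `top_k`, and the stand-in map `stand_in` (a cubic) are all assumptions made for illustration; none of them come from the paper, and the cubic is not the paper's fitted calibration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k(query, items, k, transform=lambda s: s):
    """Return the names of the k items most similar to the query.

    `transform` is applied to every cosine score before ranking.
    """
    scored = [(transform(cosine(query, vec)), name) for name, vec in items.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


def stand_in(score):
    """Hypothetical strictly increasing map, used only as a placeholder."""
    return score ** 3


# Toy vectors invented for this example; all cosines to the query are distinct.
query = (1.0, 0.0)
items = {
    "a": (1.0, 0.1),
    "b": (1.0, 0.5),
    "c": (1.0, 1.0),
    "d": (0.2, 1.0),
}

neighbors_raw = top_k(query, items, 2)
neighbors_mapped = top_k(query, items, 2, transform=stand_in)
```

Both neighbor lists are identical because the cubic never reorders scores. A map fitted by isotonic regression is only non-decreasing, so it can introduce ties between neighboring scores; in that case rankings are preserved up to tie-breaking.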

Abstract

While raw cosine similarity in pretrained embedding spaces exhibits strong rank correlation with human judgments, anisotropy induces systematic miscalibration of absolute values: scores concentrate in a narrow high-similarity band regardless of actual semantic relatedness, limiting interpretability as a quantitative measure. Prior work addresses this by modifying the embedding space (whitening, contrastive fine-tuning), but such transformations alter geometric structure and require recomputing all embeddings. Using isotonic regression trained on human similarity judgments, we construct a monotonic transformation that achieves near-perfect calibration while preserving rank correlation and local stability (98% across seven perturbation types). Our contribution is not to replace cosine similarity, but to restore interpretability of its absolute values through monotone calibration, without…
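
To make the calibration step concrete, here is a minimal sketch of isotonic regression using the pool-adjacent-violators algorithm in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the function names `fit_isotonic` and `calibrate`, the step-function prediction rule, and the toy scores and ratings are all invented for this example.

```python
from bisect import bisect_left


def fit_isotonic(scores, targets):
    """Fit a non-decreasing step function with pool-adjacent-violators."""
    # Pool targets that share the same raw score so each score appears once.
    pooled = {}
    for x, y in zip(scores, targets):
        total, count = pooled.get(x, (0.0, 0))
        pooled[x] = (total + y, count + 1)

    # Each block is [sum of targets, number of points, largest score covered].
    blocks = []
    for x in sorted(pooled):
        total, count = pooled[x]
        blocks.append([total, count, x])
        # Merge backwards while the previous block mean exceeds the current one.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count, right = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = right

    edges = [b[2] for b in blocks]
    values = [b[0] / b[1] for b in blocks]
    return edges, values


def calibrate(score, edges, values):
    """Map a raw score to the fitted value of the block that covers it."""
    index = min(bisect_left(edges, score), len(values) - 1)
    return values[index]


# Toy data (invented for illustration): raw cosines sit in a narrow high band,
# while the hypothetical human ratings, rescaled to [0, 1], spread much wider.
raw = [0.70, 0.75, 0.80, 0.85, 0.90, 0.95]
human = [0.1, 0.3, 0.2, 0.6, 0.5, 0.9]

edges, values = fit_isotonic(raw, human)
calibrated = [calibrate(s, edges, values) for s in raw]
```

The fitted map stretches the narrow band of raw cosines over the wider range of the ratings while never reversing the order of two scores. Because the embeddings themselves are untouched, nothing needs to be recomputed; only the reported score changes.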

Taxonomy

Topics: Advanced Graph Neural Networks · Face Recognition and Perception · Data Visualization and Analytics