From Topology to Retrieval: Decoding Embedding Spaces with Unified Signatures
Florian Rottach, William Rudman, Bastian Rieck, Harrisen Scells, Carsten Eickhoff

TL;DR
This paper introduces Unified Topological Signatures (UTS), a framework for characterizing the structure of text embedding spaces that supports model interpretability and helps explain retrieval performance.
Contribution
The paper proposes UTS, a novel holistic framework that captures the topological structure of embedding spaces and predicts model properties and retrieval effectiveness.
Findings
Topological and geometric measures are highly redundant with one another.
Individual metrics often fail to sufficiently differentiate embedding spaces.
UTS predicts document retrievability and reveals similarities driven by model architecture.
Abstract
Studying how embeddings are organized in space not only enhances model interpretability but also uncovers factors that drive downstream task performance. In this paper, we present a comprehensive analysis of topological and geometric measures across a wide set of text embedding models and datasets. We find a high degree of redundancy among these measures and observe that individual metrics often fail to sufficiently differentiate embedding spaces. Building on these insights, we introduce Unified Topological Signatures (UTS), a holistic framework for characterizing embedding spaces. We show that UTS can predict model-specific properties and reveal similarities driven by model architecture. Further, we demonstrate the utility of our method by linking topological structure to ranking effectiveness and accurately predicting document retrievability. We find that a holistic, multi-attribute…
Taxonomy
Topics: Advanced Graph Neural Networks · Topological and Geometric Data Analysis · Data Visualization and Analytics
