Doubly Stochastic Neighbor Embedding on Spheres
Yao Lu, Jukka Corander, Zhirong Yang

TL;DR
This paper introduces DOSNES, a spherical embedding method that normalizes the similarity matrix to be doubly stochastic and embeds the data on a sphere instead of in a flat space, addressing the crowding problem that arises when the data include highly imbalanced similarities.
Contribution
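The paper proposes a fast normalization method for SNE that makes the similarity matrix doubly stochastic, so that all data points have equal total similarity, and shows that the resulting embeddings are often approximately spherical, which motivates using a sphere as the embedding space.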
Findings
DOSNES is reported to give clearer visualizations than existing SNE methods when the data include highly imbalanced similarities.
Spherical embeddings mitigate crowding in high-dimensional data visualization.
Empirical and theoretical analysis shows that the doubly stochastic constraint often leads to approximately spherical embeddings.
Abstract
Stochastic Neighbor Embedding (SNE) methods minimize the divergence between the similarity matrix of a high-dimensional data set and its counterpart from a low-dimensional embedding, leading to widely applied tools for data visualization. Despite their popularity, the current SNE methods experience a crowding problem when the data include highly imbalanced similarities. This implies that the data points with higher total similarity tend to get crowded around the display center. To solve this problem, we introduce a fast normalization method and normalize the similarity matrix to be doubly stochastic such that all the data points have equal total similarities. Furthermore, we show empirically and theoretically that the doubly stochasticity constraint often leads to embeddings which are approximately spherical. This suggests replacing a flat space with spheres as the embedding space. The…
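To make the doubly stochastic constraint concrete, the sketch below scales a small similarity matrix so that every row and every column sums to 1, meaning each point ends up with the same total similarity. It uses the classical Sinkhorn-Knopp iteration as a stand-in; this is not the paper's fast normalization method, which the truncated abstract above does not describe. The function name `sinkhorn_knopp`, the Gaussian similarity, and the bandwidth choice are assumptions made for illustration.

```python
import numpy as np


def sinkhorn_knopp(S, n_iter=1000, tol=1e-10):
    """Scale a strictly positive square matrix S to be approximately
    doubly stochastic, returning P = diag(r) @ S @ diag(c).

    Illustrative Sinkhorn-Knopp iteration, NOT the paper's method.
    """
    S = np.asarray(S, dtype=float)
    r = np.ones(S.shape[0])
    c = np.ones(S.shape[1])
    for _ in range(n_iter):
        c = 1.0 / (S.T @ r)   # make column sums equal to 1
        r = 1.0 / (S @ c)     # make row sums equal to 1
        # Rows are exact after the update above; check the columns.
        col_sums = c * (S.T @ r)
        if np.max(np.abs(col_sums - 1.0)) < tol:
            break
    return r[:, None] * S * c[None, :]


# Toy data (assumption): 6 random points with a Gaussian similarity.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
S = np.exp(-D / D.mean())  # strictly positive, symmetric similarities

P = sinkhorn_knopp(S)
print("row sums:", P.sum(axis=1))
print("col sums:", P.sum(axis=0))
```

Before normalization the row sums of `S` differ from point to point, which is the imbalance the abstract associates with crowding; after normalization they are all equal.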
Taxonomy
Topics: Data Visualization and Analytics · Complex Network Analysis Techniques · Advanced Clustering Algorithms Research
