Stochastic neighborhood embedding and the gradient flow of relative entropy
Ben Weinkove

TL;DR
This paper analyzes the mathematical foundations of stochastic neighborhood embedding (SNE) and t-SNE, focusing on the gradient flow of relative entropy and its long-term behavior, including bounds on the diameter of the embedded data.
Contribution
It provides a rigorous mathematical analysis of the gradient flow underlying SNE and t-SNE, revealing differences in their long-term behavior and bounds on the embedding diameter.
Findings
Diameter remains bounded for SNE
Diameter may blow up for t-SNE
Provides bounds for the long-time behavior of the embedding process
Abstract
Dimension reduction, widely used in science, maps high-dimensional data into low-dimensional space. We investigate a basic mathematical model underlying the techniques of stochastic neighborhood embedding (SNE) and its popular variant t-SNE. Distances between points in high dimensions are used to define a probability distribution on pairs of points, measuring how similar the points are. The aim is to map these points to low dimensions in an optimal way so that similar points are closer together. This is carried out by minimizing the relative entropy between two probability distributions. We consider the gradient flow of the relative entropy and analyze its long-time behavior. This is a self-contained mathematical problem about the behavior of a system of nonlinear ordinary differential equations. We find optimal bounds for the diameter of the evolving sets as time tends to infinity.…
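The model described in the abstract can be sketched numerically. The following is a minimal illustration, not the paper's own construction. It makes several assumptions that the summary above does not state: a unit-bandwidth Gaussian kernel exp(-d) for SNE, the Student-t kernel 1/(1+d) for t-SNE, an input distribution P built from a single Gaussian kernel on the high-dimensional points (practical SNE uses per-point bandwidths), and an explicit Euler step in place of the continuous gradient flow. All function names, the toy data, the step size, and the step count are hypothetical.

```python
import math

# Each model is (kernel w(d), gradient factor g(d)), where d is a squared distance.
# Assumed forms: SNE uses exp(-d); t-SNE uses 1 / (1 + d).
MODELS = {
    "sne": (lambda d: math.exp(-d), lambda d: 1.0),
    "tsne": (lambda d: 1.0 / (1.0 + d), lambda d: 1.0 / (1.0 + d)),
}

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def similarities(points, model):
    """Probability distribution on ordered pairs i != j, proportional to w(|x_i - x_j|^2)."""
    w, _ = MODELS[model]
    n = len(points)
    raw = [[0.0 if i == j else w(sq_dist(points[i], points[j])) for j in range(n)]
           for i in range(n)]
    total = sum(map(sum, raw))
    return [[v / total for v in row] for row in raw]

def relative_entropy(P, Q):
    """Relative entropy sum_ij p_ij * log(p_ij / q_ij), skipping zero entries of P."""
    return sum(p * math.log(p / q)
               for row_p, row_q in zip(P, Q)
               for p, q in zip(row_p, row_q) if p > 0.0)

def gradient(P, Y, model):
    """Gradient of the relative entropy in Y: 4 * sum_j (p_ij - q_ij) g(d_ij) (y_i - y_j)."""
    _, g = MODELS[model]
    Q = similarities(Y, model)
    n, dim = len(Y), len(Y[0])
    grad = [[0.0] * dim for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                c = 4.0 * (P[i][j] - Q[i][j]) * g(sq_dist(Y[i], Y[j]))
                for k in range(dim):
                    grad[i][k] += c * (Y[i][k] - Y[j][k])
    return grad

def euler_flow(P, Y, model, dt=0.01, steps=200):
    """Explicit Euler approximation of the gradient flow dY/dt = -grad C(Y)."""
    Y = [list(y) for y in Y]
    for _ in range(steps):
        G = gradient(P, Y, model)
        Y = [[y - dt * g for y, g in zip(yi, gi)] for yi, gi in zip(Y, G)]
    return Y

def diameter(Y):
    """Largest pairwise distance in the embedded set."""
    return max(math.sqrt(sq_dist(a, b)) for a in Y for b in Y)

# Toy data (hypothetical): two loose pairs of points in three dimensions.
X = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (2.0, 1.0, 0.0), (2.0, 1.5, 0.0)]
P = similarities(X, "sne")
Y0 = [(0.0, 0.0), (1.0, 0.2), (0.3, 1.0), (1.2, 1.1)]

for model in ("sne", "tsne"):
    Y1 = euler_flow(P, Y0, model)
    before = relative_entropy(P, similarities(Y0, model))
    after = relative_entropy(P, similarities(Y1, model))
    print(model, "entropy:", before, "->", after, "diameter:", diameter(Y1))
```

Running the loop with a larger `steps` value and tracking `diameter` gives a rough way to observe the long-time behavior the paper studies, although a short Euler run on four points is only an illustration and says nothing rigorous about the limit as time tends to infinity.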
Taxonomy
Topics: Advanced Numerical Analysis Techniques
