GhostUMAP2: Measuring and Analyzing (r,d)-Stability of UMAP
Myeongwon Jung, Takanori Fujiwara, and Jaemin Jo

TL;DR
This paper introduces (r,d)-stability for UMAP, a framework to analyze how stochastic elements affect the stability of data point projections, with efficient computation and visualization tools.
Contribution
We propose a novel (r,d)-stability framework for UMAP that quantifies stochastic projection stability and includes efficient algorithms and visualization tools.
Findings
The efficient computation reduces runtime by up to 60% while maintaining accuracy.
Approximately 90% of unstable points are still identified.
The stability analysis is demonstrated on real-world datasets.
Abstract
Despite the widespread use of Uniform Manifold Approximation and Projection (UMAP), the impact of its stochastic optimization process on the results remains underexplored. We observed that it often produces unstable results where the projections of data points are determined mostly by chance rather than reflecting neighboring structures. To address this limitation, we introduce (r,d)-stability to UMAP: a framework that analyzes the stochastic positioning of data points in the projection space. To assess how stochastic elements, specifically initial projection positions and negative sampling, impact UMAP results, we introduce "ghosts", or duplicates of data points representing potential positional variations due to stochasticity. We define a data point's projection as (r,d)-stable if its ghosts perturbed within a circle of radius r in the initial projection remain confined within a…
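The ghost idea in the abstract can be illustrated with a minimal sketch. The abstract is truncated before it finishes the definition, so the confinement test used here (every ghost ends within distance d of its original point's final position) is an assumption, as are the helper names `sample_ghosts` and `is_rd_stable` and the uniform sampling inside the disk of radius r. The sketch does not run UMAP: the "final" positions are a stand-in for the layout an actual optimization would produce.

```python
import numpy as np

def sample_ghosts(init_xy, n_ghosts, r, rng):
    """Place n_ghosts copies of each point uniformly inside a disk of
    radius r around its initial 2-D position (hypothetical helper).
    Returns an array of shape (n_points, n_ghosts, 2)."""
    n_points = init_xy.shape[0]
    # Taking sqrt of the uniform draw gives uniform density over the disk area.
    radius = r * np.sqrt(rng.uniform(0.0, 1.0, size=(n_points, n_ghosts)))
    angle = rng.uniform(0.0, 2.0 * np.pi, size=(n_points, n_ghosts))
    offsets = np.stack([radius * np.cos(angle), radius * np.sin(angle)], axis=-1)
    return init_xy[:, None, :] + offsets

def is_rd_stable(final_xy, final_ghosts, d):
    """Assumed criterion: a point is stable if all of its ghosts end
    within distance d of the point's own final position."""
    dist = np.linalg.norm(final_ghosts - final_xy[:, None, :], axis=-1)
    return dist.max(axis=1) <= d

rng = np.random.default_rng(0)
init_xy = rng.normal(size=(5, 2))           # toy initial projection
ghosts = sample_ghosts(init_xy, n_ghosts=16, r=0.1, rng=rng)

# Stand-in for the optimized layout; a real analysis would optimize the
# original points and their ghosts with UMAP before this check.
final_xy = init_xy.copy()
final_ghosts = ghosts.copy()
final_ghosts[0, 0] += np.array([5.0, 0.0])  # pretend one ghost of point 0 drifted away

stable = is_rd_stable(final_xy, final_ghosts, d=0.5)
print(stable)
```

In this toy setup only point 0 is flagged as unstable, because one of its ghosts was moved far outside the radius d. The values of r and d are arbitrary here and would need to be chosen relative to the scale of the projection.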
Taxonomy
Topics: Network Traffic and Congestion Control · Wireless Networks and Protocols · Mobile Agent-Based Network Management
