Sampling-enabled scalable manifold learning unveils the discriminative cluster structure of high-dimensional data
Dehua Peng, Zhipeng Gui, Wenzhang Wei, Fa Li, Jie Gui, Huayi Wu, Jianya Gong

TL;DR
This paper introduces SUDE, a scalable manifold learning method that uses sampling and landmark-based embedding to effectively analyze large-scale, high-dimensional data while preserving cluster structures and global geometry.
Contribution
The paper proposes SUDE, a novel sampling-based manifold learning technique that improves scalability and produces more discriminative embeddings for large-scale, high-dimensional datasets.
Findings
SUDE outperforms existing methods in scalability and cluster separation.
It maintains embedding quality even with reduced sampling rates.
SUDE effectively analyzes single-cell and ECG data for pattern detection.
Abstract
As a pivotal branch of machine learning, manifold learning uncovers the intrinsic low-dimensional structure within complex nonlinear manifolds in high-dimensional space for visualization, classification, clustering, and gaining key insights. Although existing techniques have achieved remarkable successes, they suffer from extensive distortions of cluster structure, which hinders the understanding of underlying patterns. Scalability issues also limit their applicability for handling large-scale data. We hence propose a sampling-based Scalable manifold learning technique that enables Uniform and Discriminative Embedding, namely SUDE, for large-scale and high-dimensional data. It starts by seeking a set of landmarks to construct the low-dimensional skeleton of the entire data, and then incorporates the non-landmarks into the learned space based on the constrained locally linear embedding…
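The two-stage idea in the abstract (embed a set of landmarks first, then place the remaining points relative to them) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract is truncated and does not specify the sampling scheme or the landmark embedding step, so the sketch assumes uniform random sampling of landmarks and uses plain PCA as a stand-in for the landmark embedding. Non-landmarks are placed with standard locally linear embedding reconstruction weights under a sum-to-one constraint, which may differ from the constrained variant used in SUDE. The names `lle_weights` and `landmark_embed` and all parameters are hypothetical.

```python
import numpy as np


def lle_weights(x, neighbors, reg=1e-3):
    """Sum-to-one weights that best reconstruct x from its neighbors (standard LLE)."""
    diff = neighbors - x                          # (k, d) offsets from x
    gram = diff @ diff.T                          # (k, k) local Gram matrix
    # Regularize so the system stays solvable when k > d or neighbors coincide.
    gram = gram + reg * (np.trace(gram) + 1e-12) * np.eye(len(neighbors))
    w = np.linalg.solve(gram, np.ones(len(neighbors)))
    return w / w.sum()                            # enforce the sum-to-one constraint


def landmark_embed(X, n_landmarks, k=5, n_components=2, seed=0):
    """Embed landmarks first, then place every other point from its nearest landmarks."""
    rng = np.random.default_rng(seed)
    # ASSUMPTION: uniform random landmark sampling; the paper's scheme is not given here.
    idx = rng.choice(len(X), size=n_landmarks, replace=False)
    landmarks = X[idx]

    # ASSUMPTION: PCA (via SVD) stands in for the landmark embedding step.
    centered = landmarks - landmarks.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    y_land = centered @ vt[:n_components].T

    Y = np.empty((len(X), n_components))
    Y[idx] = y_land

    # Place each non-landmark as a weighted combination of its k nearest landmarks,
    # reusing the weights computed in the high-dimensional space.
    for i in np.setdiff1d(np.arange(len(X)), idx):
        dist = np.linalg.norm(landmarks - X[i], axis=1)
        nn = np.argsort(dist)[:k]
        w = lle_weights(X[i], landmarks[nn])
        Y[i] = w @ y_land[nn]
    return Y, idx
```

For example, `Y, idx = landmark_embed(X, n_landmarks=50)` on an array `X` of shape `(200, 10)` returns a `(200, 2)` embedding in which only the 50 landmarks went through the (stand-in) embedding step. The cost of that step therefore depends on the number of landmarks rather than on the full data size, which is the scalability argument the abstract makes.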
Taxonomy
Topics: Single-cell and spatial transcriptomics
Methods: Sparse Evolutionary Training
