Supervised Manifold Learning via Random Forest Geometry-Preserving Proximities
Jake S. Rhodes

TL;DR
This paper introduces a supervised manifold learning method leveraging random forest proximities to better preserve data geometry, improving low-dimensional embeddings for high-dimensional data.
Contribution
It proposes a novel kernel based on random forest proximities for supervised manifold learning, addressing limitations of class-conditional distances in existing methods.
Findings
Proximity-based kernels improve local structure preservation.
Diffusion algorithms maintain global structure with the new kernel.
Supervised embeddings outperform traditional class-conditional methods.
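To make the diffusion finding concrete, here is a minimal sketch of a textbook diffusion map computed from a precomputed symmetric kernel. It is not the paper's algorithm. The function name `diffusion_map`, the parameters `n_components` and `t`, and the toy kernel are all assumptions introduced for illustration. In the paper's setting the kernel would be a random forest proximity matrix.

```python
import numpy as np

def diffusion_map(kernel, n_components=2, t=1):
    """Textbook diffusion map from a symmetric, non-negative kernel (illustrative sketch)."""
    K = np.asarray(kernel, dtype=float)
    degree = K.sum(axis=1)                      # row sums of the kernel
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric conjugate of the Markov matrix P = D^{-1} K; it shares P's eigenvalues.
    S = K * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]           # sort eigenpairs in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    right = eigvecs * d_inv_sqrt[:, None]       # right eigenvectors of P
    # Drop the trivial constant eigenvector and scale by eigenvalue^t (t diffusion steps).
    coords = right[:, 1:n_components + 1] * eigvals[1:n_components + 1] ** t
    return eigvals, coords

# Hypothetical kernel: two tightly linked pairs of points, weakly linked to each other.
toy_kernel = np.array([
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.0, 0.1],
    [0.1, 0.0, 1.0, 0.8],
    [0.0, 0.1, 0.8, 1.0],
])
eigvals, coords = diffusion_map(toy_kernel, n_components=2, t=1)
```

On this toy kernel the first diffusion coordinate takes one sign for the first pair of points and the opposite sign for the second pair, which is the kind of coarse, global grouping that diffusion-based embeddings are meant to retain.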
Abstract
Manifold learning approaches seek the intrinsic, low-dimensional data structure within a high-dimensional space. Mainstream manifold learning algorithms, such as Isomap, UMAP, t-SNE, Diffusion Map, and Laplacian Eigenmaps, do not use data labels and are thus considered unsupervised. Existing supervised extensions of these methods are limited to classification problems and fall short of uncovering meaningful embeddings due to their construction using order non-preserving, class-conditional distances. In this paper, we show the weaknesses of class-conditional manifold learning quantitatively and visually and propose an alternate choice of kernel for supervised dimensionality reduction using a data-geometry-preserving variant of random forest proximities as an initialization for manifold learning methods. We show that local structure preservation using these proximities is near universal…
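As a rough illustration of what a random forest proximity is, the sketch below computes the classic leaf co-occurrence proximity: the fraction of trees in which two samples land in the same terminal node. This is the traditional definition, not the geometry-preserving variant the abstract refers to, whose details are not given here. The function name `leaf_cooccurrence_proximity` and the toy leaf assignments are hypothetical. In practice the leaf-index matrix could come from a fitted forest, for example via scikit-learn's `RandomForestClassifier.apply`.

```python
import numpy as np

def leaf_cooccurrence_proximity(leaves):
    """Classic random forest proximity (illustrative sketch).

    leaves: integer array of shape (n_samples, n_trees); leaves[i, t] is the
    index of the terminal node that sample i reaches in tree t.
    Returns an (n_samples, n_samples) matrix of values in [0, 1].
    """
    leaves = np.asarray(leaves)
    # Compare every pair of samples tree by tree, then average over trees.
    same_leaf = leaves[:, None, :] == leaves[None, :, :]
    return same_leaf.mean(axis=2)

# Hypothetical leaf assignments for 4 samples in a 3-tree forest.
toy_leaves = np.array([
    [0, 2, 1],
    [0, 2, 3],
    [1, 2, 3],
    [4, 5, 6],
])
prox = leaf_cooccurrence_proximity(toy_leaves)

# One simple way (an assumption, not the paper's recipe) to turn the
# proximities into dissimilarities for a method that accepts precomputed distances.
dissimilarity = 1.0 - prox
```

Because the forest is trained on the labels, samples that the trees route together end up with high proximity, which is how supervision enters the kernel without resorting to class-conditional distances.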
Taxonomy
Topics: Landslides and related hazards · Face and Expression Recognition
Methods: Diffusion
