Enhancing Supervised Visualization through Autoencoder and Random Forest Proximities for Out-of-Sample Extension
Shuang Ni, Adrien Aumon, Guy Wolf, Kevin R. Moon, Jake S. Rhodes

TL;DR
This paper introduces a semi-supervised out-of-sample extension method for RF-PHATE, combining autoencoders and random forest proximities to improve embedding robustness and reduce training time.
Contribution
It presents a novel out-of-sample extension technique that integrates autoencoders with random forest proximities, enhancing robustness and efficiency in supervised dimensionality reduction.
Findings
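Autoencoders that reconstruct random forest (RF) proximities are more robust for out-of-sample embedding extension than the other architectures assessed.
Proximity-based prototypes reduce training time by 40% without compromising extension quality.
The method achieves consistent quality with only 10% of the training data.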
Abstract
The value of supervised dimensionality reduction lies in its ability to uncover meaningful connections between data features and labels. Common dimensionality reduction methods embed a set of fixed, latent points, but are not capable of generalizing to an unseen test set. In this paper, we provide an out-of-sample extension method for the random forest-based supervised dimensionality reduction method, RF-PHATE, combining information learned from the random forest model with the function-learning capabilities of autoencoders. Through quantitative assessment of various autoencoder architectures, we identify that networks that reconstruct random forest proximities are more robust for the embedding extension problem. Furthermore, by leveraging proximity-based prototypes, we achieve a 40% reduction in training time without compromising extension quality. Our method does not require label…
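To make the notion of a random forest proximity concrete, the following is a minimal sketch of the classic shared-leaf definition: the proximity between two samples is the fraction of trees in which they land in the same leaf. This is an illustrative assumption, not necessarily the proximity variant the paper uses, and the function names `rf_proximities` and `oos_proximities` are hypothetical. The input is a table of leaf indices per sample and tree, which in practice could come from a fitted forest (for example via scikit-learn's `RandomForestClassifier.apply`); here it is hard-coded.

```python
from typing import List, Sequence


def shared_leaf_fraction(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of trees in which two samples fall into the same leaf."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def rf_proximities(leaves: Sequence[Sequence[int]]) -> List[List[float]]:
    """Pairwise proximities among training samples.

    leaves[i][t] is the index of the leaf that sample i reaches in tree t.
    """
    n = len(leaves)
    prox = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # The matrix is symmetric, so fill both halves at once.
            prox[i][j] = prox[j][i] = shared_leaf_fraction(leaves[i], leaves[j])
    return prox


def oos_proximities(new_leaves: Sequence[int],
                    train_leaves: Sequence[Sequence[int]]) -> List[float]:
    """Proximity row of one unseen sample against every training sample."""
    return [shared_leaf_fraction(new_leaves, row) for row in train_leaves]


# Toy example: 3 training samples, 3 trees (leaf indices are made up).
train_leaves = [
    [0, 1, 2],
    [0, 1, 3],
    [4, 5, 3],
]
train_prox = rf_proximities(train_leaves)

# An unseen sample is routed through the same trees to get its leaves.
new_row = oos_proximities([0, 5, 3], train_leaves)
```

Routing an unseen sample through the trees uses only its features, so a proximity row like `new_row` can be computed without a label. In a setup like the one the abstract describes, such a row would be the kind of vector an autoencoder could take as input and reconstruct; the network itself is omitted from this sketch.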
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · AI in cancer detection · Machine Learning and Data Classification
Methods: Sparse Evolutionary Training
