Automatic Selection of t-SNE Perplexity
Yanshuai Cao, Luyu Wang

TL;DR
This paper introduces an automatic method for selecting the perplexity hyperparameter in t-SNE, simplifying the tuning process and aligning with expert preferences across datasets.
Contribution
It proposes a model selection objective for t-SNE perplexity that requires minimal additional computation and is validated against human expert preferences.
Findings
The selected perplexity settings are consistent with preferences elicited from human experts across several datasets
The method requires negligible extra computation
Analysis relates the approach to BIC and MDL
Abstract
t-Distributed Stochastic Neighbor Embedding (t-SNE) is one of the most widely used dimensionality reduction methods for data visualization, but it has a perplexity hyperparameter that requires manual selection. In practice, proper tuning of t-SNE perplexity requires users to understand the inner workings of the method as well as to have hands-on experience. We propose a model selection objective for t-SNE perplexity that requires negligible extra computation beyond that of t-SNE itself. We empirically validate that the perplexity settings found by our approach are consistent with preferences elicited from human experts across a number of datasets. The similarities of our approach to the Bayesian information criterion (BIC) and minimum description length (MDL) are also analyzed.
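The abstract does not state the objective itself. As a minimal sketch, assume it takes the BIC-like form S(Perp) = 2 KL(P||Q) + log(n) * Perp / n, where KL(P||Q) is the final t-SNE loss at a given perplexity and n is the number of data points. This form is an assumption recalled from the paper, not something this summary states, and the helper names and KL values below are hypothetical. The sketch scores each candidate perplexity and picks the minimizer.

```python
import math


def perplexity_score(kl_divergence, perplexity, n_samples):
    """Assumed BIC-like score: twice the final KL loss plus a penalty
    that grows linearly with perplexity (form is an assumption, see text)."""
    return 2.0 * kl_divergence + math.log(n_samples) * perplexity / n_samples


def select_perplexity(kl_by_perplexity, n_samples):
    """Return the candidate perplexity with the lowest score, plus all scores."""
    scores = {
        perp: perplexity_score(kl, perp, n_samples)
        for perp, kl in kl_by_perplexity.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical final KL values from t-SNE runs at each candidate perplexity.
# KL typically decreases as perplexity grows, which is why a penalty is needed.
kl_by_perplexity = {5: 1.90, 30: 1.20, 50: 1.05, 100: 0.98, 200: 0.95}

best, scores = select_perplexity(kl_by_perplexity, n_samples=1000)
print(best)  # 50 for these made-up values
```

In practice the KL values would come from the t-SNE runs that are being compared anyway (for example, scikit-learn's `TSNE` exposes the final loss as `kl_divergence_` after fitting), which is consistent with the claim that the objective adds negligible computation.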
Taxonomy
Topics: Evolutionary Algorithms and Applications · Algorithms and Data Compression · Cellular Automata and Applications
