Automatic Selection of t-SNE Perplexity

Yanshuai Cao; Luyu Wang

arXiv:1708.03229 · cs.AI · August 11, 2017 · 38 cites

Open Access

TL;DR

This paper introduces an automatic method for selecting the perplexity hyperparameter in t-SNE. It simplifies tuning, and the settings it selects agree with human expert preferences across a number of datasets.

Contribution

It proposes a model selection objective for t-SNE perplexity that requires minimal additional computation and is validated against human expert preferences.

Findings

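01 Perplexity settings found by the method are consistent with preferences elicited from human experts across a number of datasets

02 The method requires negligible extra computation beyond that of t-SNE itself

03 Analysis relates the approach to the Bayesian information criterion (BIC) and minimum description length (MDL)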

Abstract

t-Distributed Stochastic Neighbor Embedding (t-SNE) is one of the most widely used dimensionality reduction methods for data visualization, but it has a perplexity hyperparameter that requires manual selection. In practice, proper tuning of t-SNE perplexity requires users to understand the inner workings of the method as well as to have hands-on experience. We propose a model selection objective for t-SNE perplexity that requires negligible extra computation beyond that of t-SNE itself. We empirically validate that the perplexity settings found by our approach are consistent with preferences elicited from human experts across a number of datasets. The similarities of our approach to the Bayesian information criterion (BIC) and minimum description length (MDL) are also analyzed.
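
The abstract does not spell out the objective itself. The following is a minimal sketch, assuming the criterion takes the BIC-like form S(Perp) = 2 KL(P||Q) + log(n) * Perp / n, where KL(P||Q) is the final t-SNE loss at a given perplexity and n is the number of data points. That form is an assumption to be checked against the paper's full text. The helper names (`perplexity_score`, `select_perplexity`, `kl_from_sklearn`) are hypothetical, and the optional helper assumes scikit-learn is installed.

```python
import math


def perplexity_score(kl_divergence, perplexity, n_samples):
    """Assumed BIC-like score: 2 * KL + log(n) * perplexity / n (lower is better)."""
    return 2.0 * kl_divergence + math.log(n_samples) * perplexity / n_samples


def select_perplexity(kl_by_perplexity, n_samples):
    """Pick the perplexity with the lowest assumed score.

    kl_by_perplexity maps each candidate perplexity to the final KL
    divergence of a t-SNE run at that perplexity.
    """
    scores = {
        perp: perplexity_score(kl, perp, n_samples)
        for perp, kl in kl_by_perplexity.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


def kl_from_sklearn(X, perplexities, random_state=0):
    """Optional helper: run scikit-learn's TSNE once per candidate perplexity
    and collect the final KL divergence. Each perplexity must be smaller than
    the number of rows in X."""
    from sklearn.manifold import TSNE  # imported lazily; only needed here

    results = {}
    for perp in perplexities:
        model = TSNE(n_components=2, perplexity=perp, random_state=random_state)
        model.fit(X)
        results[perp] = model.kl_divergence_
    return results
```

Under this assumed form, the only inputs are the KL divergence that each t-SNE run already reports and the perplexity value, which is in line with the abstract's claim of negligible extra computation. A typical call would be `select_perplexity(kl_from_sklearn(X, [5, 30, 100]), n_samples=len(X))`.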


Taxonomy

Topics: Evolutionary Algorithms and Applications · Algorithms and Data Compression · Cellular Automata and Applications