Using Dimensionality Reduction to Optimize t-SNE

Rikhav Shah; Sandeep Silwal

arXiv:1912.01098 · cs.LG · December 4, 2019 · 5 cites

TL;DR

This paper proposes a method combining random projections with t-SNE to reduce computational costs when embedding high-dimensional data, maintaining clustering quality while speeding up the process.

Contribution

The paper applies random projections before t-SNE so that high-dimensional datasets can be embedded efficiently with minimal loss of clustering quality.

Findings

01. Random projections preserve t-SNE clustering results.

02. Significant reduction in embedding computation time.

03. Effective for high-dimensional data in unsupervised learning.
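
As background for the first finding, a Gaussian random projection approximately preserves pairwise Euclidean distances (the Johnson-Lindenstrauss lemma), which is one plausible reason neighborhood-based methods such as t-SNE could behave similarly on the projected data. The sketch below checks this numerically on synthetic data. The function names, the synthetic dataset, and the choice of 200 target dimensions are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def random_project(X, k, seed=0):
    """Project the rows of X from d dimensions to k with a Gaussian random matrix.

    Hypothetical helper for illustration; not the authors' code.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Entries drawn from N(0, 1/k) so squared lengths are preserved in expectation.
    R = rng.normal(0.0, 1.0 / np.sqrt(k), size=(d, k))
    return X @ R

def pairwise_distances(X):
    """Euclidean distances between all distinct pairs of rows."""
    sq = np.sum(X * X, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    iu = np.triu_indices(X.shape[0], k=1)
    # Clip tiny negative values caused by floating-point round-off.
    return np.sqrt(np.maximum(d2[iu], 0.0))

# Synthetic high-dimensional data (assumption: 200 points in 2000 dimensions).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2000))
Y = random_project(X, k=200)

# Ratio of projected to original distance for every pair of points.
ratios = pairwise_distances(Y) / pairwise_distances(X)
print("min ratio:", ratios.min(), "max ratio:", ratios.max())
```

Ratios close to 1 indicate that the projection distorted distances only mildly. How much distortion t-SNE tolerates in practice is an empirical question that the paper addresses; this sketch does not.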

Abstract

t-SNE is a popular tool for embedding multi-dimensional datasets into two or three dimensions. However, it has a large computational cost, especially when the input data has many dimensions. Many use t-SNE to embed the output of a neural network, which is generally of much lower dimension than the original data. This limits the use of t-SNE in unsupervised scenarios. We propose using random projections to embed high-dimensional datasets into relatively few dimensions, and then using t-SNE to obtain a two-dimensional embedding. We show that random projections preserve the desirable clustering achieved by t-SNE, while dramatically reducing the runtime of finding the embedding.
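
The two-stage pipeline described in the abstract can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper name `project_then_tsne`, the intermediate dimension of 50, the perplexity, and the synthetic three-cluster dataset are all hypothetical choices made for the example.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.random_projection import GaussianRandomProjection

def project_then_tsne(X, n_intermediate=50, seed=0):
    """Randomly project X to n_intermediate dimensions, then run t-SNE to 2-D.

    Hypothetical helper; n_intermediate=50 is an assumed value, not from the paper.
    """
    # Stage 1: data-independent Gaussian random projection.
    proj = GaussianRandomProjection(n_components=n_intermediate, random_state=seed)
    X_low = proj.fit_transform(X)
    # Stage 2: t-SNE on the lower-dimensional points.
    tsne = TSNE(n_components=2, perplexity=30.0, init="random", random_state=seed)
    return tsne.fit_transform(X_low)

# Synthetic data (assumption): three well-separated clusters in 1000 dimensions.
rng = np.random.default_rng(0)
centers = rng.normal(scale=10.0, size=(3, 1000))
labels = np.repeat(np.arange(3), 100)
X = centers[labels] + rng.normal(size=(300, 1000))

embedding = project_then_tsne(X)
print(embedding.shape)  # (300, 2)
```

Because the projection matrix does not depend on the data, the first stage needs no labels or training, which is what makes the approach applicable in unsupervised settings.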

Code & Models

Repositories

ssilwa/optml (Official)

Taxonomy

Topics: Face and Expression Recognition · Neural Networks and Applications · Stochastic Gradient Optimization Techniques