Visualizing Large-scale and High-dimensional Data

Jian Tang; Jingzhou Liu; Ming Zhang; Qiaozhu Mei

arXiv:1602.00370·cs.LG·April 6, 2016

TL;DR

LargeVis is a scalable visualization technique for large-scale, high-dimensional data that constructs an approximate K-nearest neighbor graph and employs a probabilistic model optimized with stochastic gradient descent, outperforming existing methods like t-SNE.

Contribution

The paper introduces LargeVis, a novel scalable visualization method that significantly reduces computational costs and improves stability compared to prior techniques like t-SNE.

Findings

01 LargeVis scales to millions of data points.

02 LargeVis outperforms t-SNE in efficiency and effectiveness.

03 Hyper-parameters of LargeVis are more stable across datasets.

Abstract

We study the problem of visualizing large-scale and high-dimensional data in a low-dimensional (typically 2D or 3D) space. Much success has been reported recently by techniques that first compute a similarity structure of the data points and then project them into a low-dimensional space with the structure preserved. These two steps suffer from considerable computational costs, preventing state-of-the-art methods such as t-SNE from scaling to large-scale and high-dimensional data (e.g., millions of data points and hundreds of dimensions). We propose LargeVis, a technique that first constructs an accurately approximated K-nearest neighbor graph from the data and then lays out the graph in the low-dimensional space. Compared to t-SNE, LargeVis significantly reduces the computational cost of the graph construction step and employs a principled probabilistic model for the…
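The two-stage pipeline described in the abstract (build a K-nearest neighbor graph, then lay it out in 2D with a probabilistic model trained by stochastic gradient descent) can be illustrated with a toy sketch. This is not the authors' implementation. It makes several assumptions that the text above does not state: the graph is built by exact brute-force search rather than an approximate method, an edge between two layout points is assumed to exist with probability 1 / (1 + d^2) where d is their distance, non-edges are handled by uniform negative sampling, and every name and default value (`knn_graph`, `layout`, `gamma`, `n_neg`, `lr`, the clipping bound) is hypothetical.

```python
import numpy as np


def knn_graph(X, k):
    """Exact brute-force KNN graph (assumption: a stand-in for the
    approximate construction; cost is quadratic in the number of points)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # exclude self-matches
    return np.argsort(d2, axis=1)[:, :k]              # (n, k) neighbor ids


def layout(neighbors, dim=2, n_steps=20000, n_neg=5, gamma=7.0,
           lr=1.0, clip=5.0, seed=0):
    """SGD layout of a KNN graph with negative sampling.

    Assumed model: P(edge between i and j) = 1 / (1 + ||y_i - y_j||^2).
    All hyper-parameter defaults here are illustrative only.
    """
    rng = np.random.default_rng(seed)
    n, k = neighbors.shape
    Y = rng.normal(scale=1e-2, size=(n, dim))
    eps = 1e-6
    for step in range(n_steps):
        alpha = lr * (1.0 - step / n_steps)           # linearly decaying rate
        i = rng.integers(n)
        j = neighbors[i, rng.integers(k)]             # sample one observed edge
        # Attraction: ascend log P(edge) = -log(1 + d^2).
        diff = Y[i] - Y[j]
        g = np.clip(-2.0 * diff / (1.0 + diff @ diff), -clip, clip)
        Y[i] += alpha * g
        Y[j] -= alpha * g
        # Repulsion: ascend gamma * log(1 - P(edge)) for sampled non-edges.
        for _ in range(n_neg):
            m = rng.integers(n)
            if m == i or m == j:
                continue
            diff = Y[i] - Y[m]
            d2 = diff @ diff
            g = gamma * 2.0 * diff / ((eps + d2) * (1.0 + d2))
            Y[i] += alpha * np.clip(g, -clip, clip)
    return Y


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two well-separated Gaussian blobs in 50 dimensions.
    X = np.vstack([rng.normal(0.0, 1.0, (100, 50)),
                   rng.normal(8.0, 1.0, (100, 50))])
    Y = layout(knn_graph(X, k=10))
    print(Y.shape)
```

The brute-force neighbor search is only practical for small inputs; it is used here so the sketch stays self-contained. Sampling one edge per step means each update touches a handful of points, which is the property that lets this style of optimization scale with the number of edges rather than the number of point pairs.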
