Parametric UMAP embeddings for representation and semi-supervised learning

Tim Sainburg; Leland McInnes; Timothy Q Gentner

arXiv:2009.12981·cs.LG·August 31, 2021

TL;DR

This paper introduces Parametric UMAP, a neural network-based extension of UMAP that enables fast, online embeddings and improves semi-supervised learning by leveraging learned data-structure relationships.

Contribution

The paper extends UMAP to a parametric form by optimizing neural network weights instead of the embedding coordinates directly. The learned mapping allows new data to be embedded rapidly and supports semi-supervised learning.

Findings

1. Parametric UMAP performs comparably to non-parametric UMAP.
2. It enables fast online embedding of new data points.
3. It improves classifier accuracy in semi-supervised learning.

Abstract

UMAP is a non-parametric graph-based dimensionality reduction algorithm using applied Riemannian geometry and algebraic topology to find low-dimensional embeddings of structured data. The UMAP algorithm consists of two steps: (1) Compute a graphical representation of a dataset (fuzzy simplicial complex), and (2) Through stochastic gradient descent, optimize a low-dimensional embedding of the graph. Here, we extend the second step of UMAP to a parametric optimization over neural network weights, learning a parametric relationship between data and embedding. We first demonstrate that Parametric UMAP performs comparably to its non-parametric counterpart while conferring the benefit of a learned parametric mapping (e.g. fast online embeddings for new data). We then explore UMAP as a regularization, constraining the latent distribution of autoencoders, parametrically varying global structure…
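The second step described in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical linear encoder in place of a neural network, a hand-written list of graph edges in place of the fuzzy simplicial complex from step (1), and the curve parameters a = b = 1 (UMAP normally fits these from its min_dist setting). The loss is the cross-entropy between the graph edge weights p and the low-dimensional similarity q = 1 / (1 + a * d^(2b)); the point is that it is a function of the encoder weights rather than of free embedding coordinates. All function and variable names here are illustrative and do not come from the paper's code.

```python
import math

def encode(weights, x):
    """Hypothetical linear encoder y = W x, a stand-in for a neural network."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def low_dim_similarity(y_i, y_j, a=1.0, b=1.0):
    """UMAP-style similarity in embedding space: 1 / (1 + a * ||y_i - y_j||^(2b)).
    a = b = 1 is an assumption made for this sketch."""
    d2 = sum((u - v) ** 2 for u, v in zip(y_i, y_j))
    return 1.0 / (1.0 + a * d2 ** b)

def umap_cross_entropy(weights, data, edges, eps=1e-6):
    """Cross-entropy between graph weights p and embedding similarities q,
    evaluated through the encoder so that it depends on `weights`."""
    loss = 0.0
    for i, j, p in edges:
        q = low_dim_similarity(encode(weights, data[i]), encode(weights, data[j]))
        q = min(max(q, eps), 1.0 - eps)  # clip to keep the logarithms finite
        loss += -p * math.log(q) - (1.0 - p) * math.log(1.0 - q)
    return loss

# Toy data: points 0 and 1 are neighbours, point 2 is far from both.
data = [[1.0, 0.0, 0.0], [1.0, 0.1, 0.0], [0.0, 0.0, 1.0]]
# (i, j, p): assumed edge weights standing in for the output of step (1).
edges = [(0, 1, 1.0), (0, 2, 0.0), (1, 2, 0.0)]

# An encoder that keeps neighbours together and pushes point 2 away ...
weights_good = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
# ... versus one that collapses point 2 onto point 0.
weights_bad = [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]

loss_good = umap_cross_entropy(weights_good, data, edges)
loss_bad = umap_cross_entropy(weights_bad, data, edges)
print(loss_good < loss_bad)
```

In a real setting the weights would be updated by stochastic gradient descent on this loss. Once they are fixed, embedding an unseen point is a single call such as `encode(weights_good, x_new)`, which is what makes online embedding of new data fast.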

Taxonomy

Methods: Parametric UMAP