Unsupervised Metric Learning in Presence of Missing Data

Anna C. Gilbert; Rishi Sonthalia

arXiv:1807.07610·cs.LG·March 5, 2019

TL;DR

This paper introduces MR-MISSING, an algorithm for unsupervised metric learning that handles missing entries in high-dimensional data lying on a low-dimensional manifold. It is reported to outperform traditional methods in projection accuracy and in classification on MNIST with missing data.

Contribution

The paper presents MR-MISSING, a new algorithm that extends existing dimension reduction techniques to work directly with incomplete data, providing theoretical guarantees and practical improvements.

Findings

01 Effective visualization on synthetic manifolds

02 Improved projection accuracy on MNIST with missing data

03 Successful classification on MNIST with incomplete data

Abstract

For many machine learning tasks, the input data lie on a low-dimensional manifold embedded in a high-dimensional space and, because of this high-dimensional structure, most algorithms are inefficient. The typical solution is to reduce the dimension of the input data using standard dimension reduction algorithms such as ISOMAP, LAPLACIAN EIGENMAPS or LLE. This approach, however, does not always work in practice, as these algorithms require somewhat ideal data. Unfortunately, most data sets either have missing entries or unacceptably noisy values. That is, real data are far from ideal and we cannot use these algorithms directly. In this paper, we focus on the case of missing data. Some techniques, such as matrix completion, can be used to fill in missing data, but these methods do not capture the non-linear structure of the manifold. Here, we present a new algorithm…
