Minimum-Distortion Embedding

Akshay Agrawal; Alnur Ali; Stephen Boyd

arXiv:2103.02559 · cs.LG · August 26, 2021

TL;DR

This paper introduces the minimum-distortion embedding (MDE) framework, a versatile approach for vector embedding that unifies many existing methods and provides scalable algorithms and software for large datasets.

Contribution

The paper formalizes the MDE problem, develops a scalable quasi-Newton algorithm for solving it, and implements PyMDE, an open-source Python package for computing embeddings flexibly and efficiently.

Findings

01. MDE encompasses many existing embedding techniques.

02. The proposed algorithm scales to datasets with millions of items.

03. PyMDE software enables rapid experimentation with various embeddings.

Abstract

We consider the vector embedding problem. We are given a finite set of items, with the goal of assigning a representative vector to each one, possibly under some constraints (such as the collection of vectors being standardized, i.e., having zero mean and unit covariance). We are given data indicating that some pairs of items are similar, and optionally, some other pairs are dissimilar. For pairs of similar items, we want the corresponding vectors to be near each other, and for dissimilar pairs, we want the corresponding vectors to not be near each other, measured in Euclidean distance. We formalize this by introducing distortion functions, defined for some pairs of the items. Our goal is to choose an embedding that minimizes the total distortion, subject to the constraints. We call this the minimum-distortion embedding (MDE) problem. The MDE framework is simple but general. It…
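
The objective described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the PyMDE API. The function names, the quadratic penalty for similar pairs, and the decaying penalty for dissimilar pairs are all hypothetical choices made for illustration. The standardization check assumes that "unit covariance" means (1/n) X^T X = I for an embedding X with n rows.

```python
import math


def distance(x, y):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def attractive(d):
    # Hypothetical distortion for similar pairs: small when vectors are near.
    return d ** 2


def repulsive(d):
    # Hypothetical distortion for dissimilar pairs: small when vectors are far.
    return 1.0 / (1.0 + d ** 2)


def total_distortion(embedding, similar, dissimilar=()):
    """Sum the distortion over all pairs for which a function is defined."""
    total = 0.0
    for i, j in similar:
        total += attractive(distance(embedding[i], embedding[j]))
    for i, j in dissimilar:
        total += repulsive(distance(embedding[i], embedding[j]))
    return total


def is_standardized(embedding, tol=1e-9):
    """Check zero mean and (assumed) unit covariance (1/n) X^T X = I."""
    n, m = len(embedding), len(embedding[0])
    for k in range(m):
        if abs(sum(x[k] for x in embedding) / n) > tol:
            return False
    for a in range(m):
        for b in range(m):
            cov = sum(x[a] * x[b] for x in embedding) / n
            if abs(cov - (1.0 if a == b else 0.0)) > tol:
                return False
    return True


# Four items embedded at the corners of a square in the plane.
square = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
similar_pairs = [(0, 1)]      # items 0 and 1 are labeled similar
dissimilar_pairs = [(0, 3)]   # items 0 and 3 are labeled dissimilar

feasible = is_standardized(square)
cost = total_distortion(square, similar_pairs, dissimilar_pairs)
```

An MDE solver would search over feasible embeddings for one with the lowest such cost. The sketch only evaluates a single candidate embedding.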

Code & Models

Repositories

cvxgrp/pymde (PyTorch, official)
