Optimal Fast Johnson-Lindenstrauss Embeddings for Large Data Sets

Stefan Bamberger; Felix Krahmer

arXiv:1712.01774·cs.DS·April 30, 2020

TL;DR

This paper introduces a fast Johnson-Lindenstrauss embedding that applies a subsampled Hadamard transform followed by a random matrix with independent entries. For large data sets it reaches an optimal embedding dimension at a cost close to that of reading the data.

Contribution

It analyzes a two-step embedding, following an approach by Nelson, that reaches an optimal embedding dimension while keeping the computational cost close to that of reading the data when large subsets of the data set are embedded simultaneously.

Findings

01 The method achieves an optimal embedding dimension for large data sets.

02 Under mild assumptions, the complexity comes close to the cost of reading the data.

03 A lower bound shows that a subsampled Hadamard transform alone cannot reach an optimal embedding dimension.

Abstract

Johnson-Lindenstrauss embeddings are widely used to reduce the dimension and thus the processing time of data. To reduce the total complexity, fast algorithms for applying these embeddings are also necessary. To date, such fast algorithms are only available either for a non-optimal embedding dimension or up to a certain threshold on the number of data points. We address a variant of this problem where one aims to simultaneously embed larger subsets of the data set. Our method follows an approach by Nelson: a subsampled Hadamard transform maps points into a space of lower, but not optimal, dimension. Subsequently, a random matrix with independent entries projects to an optimal embedding dimension. For subsets whose size scales at least polynomially in the ambient dimension, the complexity of this method comes close to the number of operations just to read the data under mild…
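
The two-step construction described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `fwht` and `two_step_embed`, the random sign flips before the Hadamard transform, the sampling of rows without replacement, the Gaussian choice for the matrix with independent entries, and the dimensions `k_mid` and `k_out` are all assumptions made for illustration. The paper's actual choice of intermediate and final dimensions is not reproduced here.

```python
import numpy as np


def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform along axis 0.

    The length of axis 0 must be a power of two. Runs in O(n log n)
    operations per column instead of the O(n^2) of a dense product.
    """
    y = np.array(x, dtype=float)  # work on a copy
    n = y.shape[0]
    if n < 1 or n & (n - 1):
        raise ValueError("length of axis 0 must be a power of two")
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = y[i:i + h].copy()
            b = y[i + h:i + 2 * h].copy()
            y[i:i + h] = a + b          # butterfly: sum
            y[i + h:i + 2 * h] = a - b  # butterfly: difference
        h *= 2
    return y


def two_step_embed(X, k_mid, k_out, rng):
    """Embed the columns of X (shape n x m) into k_out dimensions.

    Step 1 (assumed form): random signs, Hadamard transform, then keep
    k_mid randomly chosen coordinates, rescaled so that squared norms
    are preserved in expectation.
    Step 2 (assumed form): a dense Gaussian matrix with independent
    entries maps from k_mid down to k_out dimensions.
    """
    n = X.shape[0]
    signs = rng.choice([-1.0, 1.0], size=n)
    rows = rng.choice(n, size=k_mid, replace=False)
    Y = fwht(signs[:, None] * X)[rows] / np.sqrt(k_mid)
    G = rng.standard_normal((k_out, k_mid)) / np.sqrt(k_out)
    return G @ Y


rng = np.random.default_rng(0)
X = rng.standard_normal((256, 20))  # 20 points in dimension 256
Z = two_step_embed(X, k_mid=128, k_out=64, rng=rng)
# Ratios of squared norms after and before embedding; values near 1
# indicate that lengths are approximately preserved.
ratios = np.sum(Z**2, axis=0) / np.sum(X**2, axis=0)
```

The first step is the cheap one, since the Hadamard transform costs O(n log n) per point. The dense matrix in the second step only acts on the already reduced `k_mid`-dimensional vectors.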
