Fast metric embedding into the Hamming cube
Sjoerd Dirksen, Shahar Mendelson, Alexander Stollenwerk

TL;DR
This paper introduces a fast, data-oblivious method for embedding high-dimensional Euclidean data into a low-dimensional Hamming cube with near-isometric properties, using structured random matrices and binarization.
Contribution
It presents a novel, efficient embedding technique utilizing double circulant matrices that mimic Gaussian behavior, improving upon previous methods in speed and theoretical guarantees.
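To illustrate why circulant structure gives linear storage and near-linear multiplication time, here is a minimal sketch using NumPy. It shows a single generic circulant matrix, not the paper's double circulant construction, and the helper names `circulant_matvec` and `circulant_dense` are hypothetical. A circulant matrix is fully determined by its first column, and multiplying by it is a circular convolution, which the FFT computes in O(n log n).

```python
import numpy as np

def circulant_matvec(c, x):
    """Multiply the circulant matrix with first column c by x via the FFT."""
    # Circular convolution theorem: C x = IFFT(FFT(c) * FFT(x)).
    return np.fft.irfft(np.fft.rfft(c) * np.fft.rfft(x), n=len(x))

def circulant_dense(c):
    """Build the same matrix explicitly, C[i, j] = c[(i - j) mod n] (O(n^2) storage)."""
    n = len(c)
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return c[idx]

rng = np.random.default_rng(0)
n = 256
c = rng.standard_normal(n)   # Gaussian generator: only n numbers are stored
x = rng.standard_normal(n)

fast = circulant_matvec(c, x)    # O(n log n) time, O(n) storage
slow = circulant_dense(c) @ x    # O(n^2) reference computation
```

Both `fast` and `slow` compute the same product up to floating-point error; only the cost differs.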
Findings
Achieves near-isometric embedding with high probability
Uses structured random matrices for fast computation
Bounds the number of bits needed for the encoding, in a way that is essentially optimal (up to logarithmic terms)
Abstract
We consider the problem of embedding a subset of $\mathbb{R}^n$ into a low-dimensional Hamming cube in an almost isometric way. We construct a simple, data-oblivious, and computationally efficient map that achieves this task with high probability: we first apply a specific structured random matrix, which we call the double circulant matrix; using that matrix requires linear storage and matrix-vector multiplication can be performed in near-linear time. We then binarize each vector by comparing each of its entries to a random threshold, selected uniformly at random from a well-chosen interval. We estimate the number of bits required for this encoding scheme in terms of two natural geometric complexity parameters of the set - its Euclidean covering numbers and its localized Gaussian complexity. The estimate we derive turns out to be the best that one can hope for - up to logarithmic…
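The two-step scheme in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' construction: the abstract does not define the double circulant matrix, so the sketch assumes a product of two Gaussian circulant matrices with random sign flips and row subsampling. The symmetric threshold interval `[-lam, lam]`, the value `lam = 3.0`, and the names `sample_map` and `embed` are likewise assumptions. The distance estimate uses the standard heuristic for Gaussian-like rows: a bit pair disagrees with probability about sqrt(2/pi) * ||x - y|| / (2 * lam) when the projections stay inside the threshold interval.

```python
import numpy as np

def circ_mul(c, v):
    """Circulant matrix (first column c) times v, computed with the FFT."""
    return np.fft.irfft(np.fft.rfft(c) * np.fft.rfft(v), n=len(v))

def sample_map(n, m, lam, rng):
    """Draw the random parameters once; the map is data-oblivious."""
    return {
        "g1": rng.standard_normal(n),              # generator of first circulant
        "g2": rng.standard_normal(n),              # generator of second circulant
        "s1": rng.choice([-1.0, 1.0], size=n),     # random sign flips
        "s2": rng.choice([-1.0, 1.0], size=n),
        "rows": rng.choice(n, size=m, replace=False),  # keep m of n coordinates
        "tau": rng.uniform(-lam, lam, size=m),     # random thresholds (assumed interval)
    }

def embed(p, x):
    """Map x in R^n to m bits: structured random projection, then thresholding."""
    n = len(x)
    z = circ_mul(p["g2"], p["s2"] * x)
    y = circ_mul(p["g1"], p["s1"] * z) / np.sqrt(n)  # entries roughly N(0, ||x||^2)
    return y[p["rows"]] + p["tau"] > 0

rng = np.random.default_rng(1)
n, m, lam = 4096, 2048, 3.0
p = sample_map(n, m, lam, rng)

# Two random unit vectors as test points.
x = rng.standard_normal(n); x /= np.linalg.norm(x)
y = rng.standard_normal(n); y /= np.linalg.norm(y)

bits_x, bits_y = embed(p, x), embed(p, y)
hamming_fraction = np.mean(bits_x != bits_y)

# Heuristic calibration for Gaussian-like rows (an assumption, see lead-in).
estimate = np.sqrt(2 * np.pi) * lam * hamming_fraction
true_dist = np.linalg.norm(x - y)
```

Under these assumptions, `estimate` should track `true_dist` up to a sampling error that shrinks as `m` grows. How to choose the interval and the number of bits for a given set is exactly what the paper's bounds address.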
Taxonomy
Topics: Topological and Geometric Data Analysis · Advanced Combinatorial Mathematics · Computational Geometry and Mesh Generation
