Linear Dimensionality Reduction in Linear Time: Johnson-Lindenstrauss-type Guarantees for Random Subspace
Nick Lim, Robert J. Durrant

TL;DR
This paper proves data-dependent Johnson-Lindenstrauss-type norm-preservation guarantees for the random subspace method, a fast, linear-time approach to dimensionality reduction, and covers both dense and sparse data.
Contribution
It provides theoretical guarantees for random subspace methods in dimensionality reduction, including a novel densifying preprocessing for sparse data, supported by empirical validation.
Findings
When the data satisfy a mild regularity condition, random subspace approximately preserves their Euclidean geometry with high probability.
Densifying preprocessing improves performance on sparse data.
The required projection dimension is logarithmic in the number of data points, as for random projection, but with a larger constant that depends on the regularity of the data.
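The findings above can be illustrated with a minimal sketch of random subspace, assuming the common formulation in which the same k coordinates are sampled uniformly without replacement for every point and rescaled by sqrt(d/k) so that squared norms are preserved in expectation. The function names, the Gaussian test data, and the chosen values of d and k are illustrative assumptions, not taken from the paper.

```python
import math
import random


def random_subspace(points, k, seed=0):
    """Keep the same k randomly chosen coordinates of every point.

    Assumption: coordinates are rescaled by sqrt(d / k) so that the
    squared norm of each reduced point is unbiased for the original.
    """
    d = len(points[0])
    rng = random.Random(seed)
    idx = rng.sample(range(d), k)  # k distinct coordinate indices
    scale = math.sqrt(d / k)
    return [[scale * p[i] for i in idx] for p in points]


def sq_norm(v):
    """Squared Euclidean norm of a vector given as a list."""
    return sum(x * x for x in v)


# Hypothetical dense data: every coordinate carries a similar share of the norm.
d, k, n = 2000, 400, 5
data_rng = random.Random(1)
dense = [[data_rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
dense_reduced = random_subspace(dense, k)
dense_ratios = [sq_norm(r) / sq_norm(p) for p, r in zip(dense, dense_reduced)]

# Hypothetical sparse data: one nonzero coordinate holds the whole norm,
# so the reduced squared norm is either 0 or d / k, never close to 1.
sparse = [[0.0] * d]
sparse[0][0] = 1.0
sparse_reduced = random_subspace(sparse, k)
sparse_ratio = sq_norm(sparse_reduced[0]) / sq_norm(sparse[0])

print(dense_ratios)
print(sparse_ratio)
```

Selecting coordinates costs O(k) per point and needs no matrix multiplication. The sparse case shows why the abstract calls sparse data challenging: a single dominant coordinate is either kept or lost, so norm preservation fails unless k is very large or the data are densified first.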
Abstract
We consider the problem of efficient randomized dimensionality reduction with norm-preservation guarantees. Specifically we prove data-dependent Johnson-Lindenstrauss-type geometry preservation guarantees for Ho's random subspace method: When data satisfy a mild regularity condition -- the extent of which can be estimated by sampling from the data -- then random subspace approximately preserves the Euclidean geometry of the data with high probability. Our guarantees are of the same order as those for random projection, namely the required dimension for projection is logarithmic in the number of data points, but have a larger constant term in the bound which depends upon this regularity. A challenging situation is when the original data have a sparse representation, since this implies a very large projection dimension is required: We show how this situation can be improved for sparse…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Markov Chains and Monte Carlo Methods
