On Approximating the Lp Distances for p>2
Ping Li

TL;DR
This paper introduces a method to efficiently approximate pairwise Lp distances for even p in high-dimensional data using random projections, reducing computational and memory costs while maintaining accuracy.
Contribution
It proposes a novel decomposition of Lp distances for even p and two strategies for applying random projections, with theoretical analysis and practical considerations.
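As a worked instance of the decomposition, take p = 4 and expand by the binomial theorem. The symbols x, y (two rows of the data matrix) and D (the number of columns) are notation assumed here for illustration and do not appear in this summary.

```latex
\sum_{i=1}^{D} (x_i - y_i)^4
  = \underbrace{\sum_{i=1}^{D} x_i^4 + \sum_{i=1}^{D} y_i^4}_{\text{2 marginal norms}}
  \;-\; 4\sum_{i=1}^{D} x_i^3 y_i
  \;+\; 6\sum_{i=1}^{D} x_i^2 y_i^2
  \;-\; 4\sum_{i=1}^{D} x_i y_i^3
```

The last three terms are the p-1 = 3 "inner products" at different orders. Each is an ordinary inner product between element-wise powers of x and y, which is the form that random projections can approximate.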
Findings
For non-negative data, the basic projection strategy is more accurate than the alternative strategy.
The method reduces computational complexity for high-dimensional data.
The approach is applicable to large-scale machine learning tasks.
Abstract
Applications in machine learning and data mining require computing pairwise Lp distances in a data matrix A. For massive high-dimensional data, computing all pairwise distances of A can be infeasible. In fact, even storing A or all pairwise distances of A in memory may be infeasible. This paper proposes a simple method for p = 2, 4, 6, ... We first decompose the Lp distances (where p is even) into a sum of 2 marginal norms and p-1 "inner products" at different orders. Then we apply normal or sub-Gaussian random projections to approximate the resultant "inner products," assuming that the marginal norms can be computed exactly by a linear scan. We propose two strategies for applying random projections. The basic projection strategy requires only one projection matrix but it is more difficult to analyze, while the alternative projection strategy requires p-1 projection…
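The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation. It assumes the basic strategy means one shared Gaussian matrix R applied to each element-wise power of a row, and that each mixed inner product is estimated by the dot product of two projected vectors divided by the sketch size k. All function names are hypothetical, and only the Python standard library is used.

```python
import math
import random


def lp_even_exact(x, y, p):
    """Exact sum_i (x_i - y_i)^p, computed directly."""
    return sum((a - b) ** p for a, b in zip(x, y))


def lp_even_decomposed(x, y, p):
    """Same quantity for even p: 2 marginal norms plus p-1 signed inner products."""
    total = sum(a ** p for a in x) + sum(b ** p for b in y)
    for j in range(1, p):
        inner = sum(a ** (p - j) * b ** j for a, b in zip(x, y))
        total += (-1) ** j * math.comb(p, j) * inner
    return total


def make_projection(dim, k, seed=0):
    """dim x k matrix with i.i.d. N(0, 1) entries (assumed normal projections)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(dim)]


def project_powers(x, R, p):
    """Return {j: R^T (x ** j)} for j = 1..p-1, reusing the single matrix R."""
    k = len(R[0])
    return {
        j: [sum((xi ** j) * row[c] for xi, row in zip(x, R)) for c in range(k)]
        for j in range(1, p)
    }


def lp_even_estimate(x, y, R, p):
    """Exact marginal norms plus projected estimates of the mixed inner products."""
    k = len(R[0])
    ux, uy = project_powers(x, R, p), project_powers(y, R, p)
    total = sum(a ** p for a in x) + sum(b ** p for b in y)  # linear scan
    for j in range(1, p):
        # Unbiased estimate of sum_i x_i^(p-j) * y_i^j for N(0, 1) entries.
        inner_hat = sum(a * b for a, b in zip(ux[p - j], uy[j])) / k
        total += (-1) ** j * math.comb(p, j) * inner_hat
    return total


if __name__ == "__main__":
    x, y = [1.0, 2.0, 3.0, 0.5], [0.0, 1.0, 5.0, 2.0]
    R = make_projection(dim=len(x), k=200, seed=1)
    print("exact   :", lp_even_exact(x, y, 4))
    print("estimate:", lp_even_estimate(x, y, R, 4))
```

In a real setting only the projected vectors (length k per power, rather than length D) and the marginal norms would be stored per row, which is where the memory savings come from. The estimate is random, and its variance depends on k and on the data.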
Taxonomy
Topics: Data Management and Algorithms · Face and Expression Recognition · Random Matrices and Applications
