Optimal Projections in the Distance-Based Statistical Methods
Chuanping Yu, Xiaoming Huo

TL;DR
This paper proposes an optimized projection method for distance-based statistics in multivariate data, reducing computational complexity and improving accuracy over random projection approaches.
Contribution
It introduces a novel approach to pre-calculate optimal projection directions, enabling faster computation and better approximation of distances in multivariate statistical tests.
Findings
Computational complexity reduced from $O(m^2)$ to $O(n m \log m)$, where $n$ is the number of projection directions and $m$ is the sample size.
Exact solutions for optimal projections in 2D and specific cases.
Simulation results show improved accuracy over random projections.
Abstract
This paper introduces a new way to calculate distance-based statistics, particularly when the data are multivariate. The main idea is to pre-calculate the optimal projection directions given the variable dimension, and to project multidimensional variables onto these pre-specified projection directions; by subsequently utilizing the fast algorithm that is developed in Huo and Székely [2016] for the univariate variables, the computational complexity can be improved from $O(m^2)$ to $O(n m \log m)$, where $n$ is the number of projection directions and $m$ is the sample size. When $n \ll m / \log m$, computational savings can be achieved. The key challenge is how to find the optimal pre-specified projection directions. This can be obtained by minimizing the worst-case difference between the true distance and the approximated distance, which can be formulated as a nonconvex…
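The projection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes 2D data, uses equally spaced directions on the half circle as an illustrative choice of pre-specified directions, and swaps in a simple sort-based formula for the sum of pairwise absolute differences in place of the Huo and Székely [2016] univariate algorithm. The function names are hypothetical. The rescaling uses the fact that for a direction $u$ uniform on the unit circle, $E|u^\top v| = (2/\pi)\|v\|$.

```python
import numpy as np


def pairwise_abs_sum_1d(x):
    """Sum over i < j of |x_i - x_j|, computed in O(m log m) by sorting."""
    xs = np.sort(x)
    m = xs.size
    # The k-th smallest value (0-indexed) is added k times and subtracted (m - 1 - k) times.
    coef = 2 * np.arange(m) - m + 1
    return float(np.dot(coef, xs))


def projected_pairwise_distance_sum_2d(X, n_dirs):
    """Approximate the sum over i < j of ||X_i - X_j|| for 2D data with n_dirs projections."""
    # Assumption for illustration: equally spaced directions on the half circle.
    angles = np.pi * np.arange(n_dirs) / n_dirs
    U = np.column_stack([np.cos(angles), np.sin(angles)])
    proj = X @ U.T  # shape (m, n_dirs): each column is one univariate projection
    total = sum(pairwise_abs_sum_1d(proj[:, k]) for k in range(n_dirs))
    # Average over directions, then rescale by pi/2 because E|u^T v| = (2/pi) ||v|| in 2D.
    return (np.pi / 2) * total / n_dirs


def exact_pairwise_distance_sum(X):
    """Reference O(m^2) computation of the sum over i < j of ||X_i - X_j||."""
    diff = X[:, None, :] - X[None, :, :]
    return float(np.sqrt((diff ** 2).sum(axis=-1)).sum() / 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 2))
    approx = projected_pairwise_distance_sum_2d(X, n_dirs=16)
    exact = exact_pairwise_distance_sum(X)
    print("relative error:", abs(approx - exact) / exact)
```

Each projection costs one sort, so the total cost is $O(n m \log m)$ for $n$ directions, compared with $O(m^2)$ for the direct pairwise computation.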
Taxonomy
Topics: Advanced Statistical Methods and Models · Statistical Methods and Inference · Machine Learning and Algorithms
