A New Framework for Distance and Kernel-based Metrics in High Dimensions
Shubhadeep Chakraborty, Xianyang Zhang

TL;DR
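This paper introduces new metrics for comparing high-dimensional distributions and for testing independence between high-dimensional random vectors. The metrics overcome a limitation of the Euclidean energy distance, which in high dimensions detects only differences in means and in the traces of covariance matrices. The accompanying scalable t-tests show superior power on simulated and real data.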
Contribution
The paper proposes a new class of metrics that, in the high-dimensional setup, detects homogeneity of and characterizes independence between the low-dimensional marginal distributions. It also proposes scalable t-tests based on these metrics.
Findings
The new metrics detect homogeneity of, and characterize independence between, low-dimensional marginal distributions in high dimensions.
The proposed t-tests have computational complexity that grows linearly with the dimension.
Tests show superior power in simulations and real data.
Abstract
The paper presents new metrics to quantify and test for (i) the equality of distributions and (ii) the independence between two high-dimensional random vectors. We show that the energy distance based on the usual Euclidean distance cannot completely characterize the homogeneity of two high-dimensional distributions in the sense that it only detects the equality of means and the traces of covariance matrices in the high-dimensional setup. We propose a new class of metrics which inherits the desirable properties of the energy distance and maximum mean discrepancy/(generalized) distance covariance and the Hilbert-Schmidt Independence Criterion in the low-dimensional setting and is capable of detecting the homogeneity of/completely characterizing independence between the low-dimensional marginal distributions in the high-dimensional setup. We further propose t-tests based on the new metrics…
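For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of the sample energy distance computed with the usual Euclidean distance. It is the classical quantity 2·E‖X−Y‖ − E‖X−X′‖ − E‖Y−Y′‖ estimated as a V-statistic, not the new metric proposed in the paper. The function names and the toy samples are assumptions made for illustration only.

```python
import math
from itertools import product


def euclidean(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def mean_pairwise_distance(A, B):
    """Average Euclidean distance over all pairs (a, b) with a in A, b in B."""
    total = sum(euclidean(a, b) for a, b in product(A, B))
    return total / (len(A) * len(B))


def energy_distance(X, Y):
    """V-statistic estimate of the Euclidean energy distance between samples.

    Illustrative helper (hypothetical name): this is the classical baseline,
    not the metric introduced in the paper.
    """
    return (2.0 * mean_pairwise_distance(X, Y)
            - mean_pairwise_distance(X, X)
            - mean_pairwise_distance(Y, Y))


# Toy samples (made up for illustration): Y is X shifted by (2, 2).
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
Y = [(2.0, 2.0), (3.0, 2.0), (2.0, 3.0)]

same = energy_distance(X, X)     # identical samples
shifted = energy_distance(X, Y)  # samples with different means
```

In this sketch `same` is zero and `shifted` is positive because the two toy samples differ in mean. The abstract's point is that, as the dimension grows, this Euclidean version responds only to differences in means and in the traces of the covariance matrices, which motivates the new metrics.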
Taxonomy
Topics: Statistical Methods and Inference · Random Matrices and Applications · Bayesian Methods and Mixture Models
