Mitigating dimensionality effects with robust graph constructions for testing
Yejiong Zhu, Hao Chen

TL;DR
This paper introduces a robust graph construction that reduces hub formation in high-dimensional and non-Euclidean data, substantially improving the power of graph-based two-sample tests and change-point detection across diverse settings.
Contribution
The paper proposes a graph construction that mitigates hub effects, improving the power of graph-based statistical tests on high-dimensional and non-Euclidean data.
Findings
Improved test power across diverse high-dimensional datasets
Theoretical proof of consistency under fixed alternatives
Successful real-world applications in neuroscience, genomics, and urban data
Abstract
Dimensionality effects pose major challenges in high-dimensional and non-Euclidean data analysis. Graph-based two-sample tests and change-point detection are particularly attractive in this context, as they make minimal distributional assumptions and perform well across a wide range of scenarios. These methods rely on similarity graphs constructed from data, with k-nearest neighbor graphs and k-minimum spanning trees among the most effective and widely used. However, in high-dimensional and non-Euclidean regimes such graphs often produce hubs -- nodes with disproportionately high degrees -- to which graph-based methods are especially sensitive. To mitigate these dimensionality effects, we propose a robust graph construction that is far less prone to hub formation. Incorporating this construction substantially improves the power of graph-based methods across diverse settings. We…
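The hub phenomenon described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's robust construction: it only builds an ordinary directed k-nearest-neighbor graph on i.i.d. Gaussian data and compares the largest in-degree in low and high dimension. The function name `knn_in_degrees`, the sample size, the value of k, and the two dimensions are all assumptions chosen for illustration.

```python
import numpy as np


def knn_in_degrees(points, k):
    """Return the in-degree of every node in the directed k-NN graph.

    Illustrative helper (hypothetical name): node j gains one in-degree
    for every other point that lists j among its k nearest neighbors.
    """
    n = points.shape[0]
    # Squared Euclidean distances computed from the Gram matrix.
    sq_norms = np.sum(points ** 2, axis=1)
    dist2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * points @ points.T
    np.fill_diagonal(dist2, np.inf)  # a point is never its own neighbor
    # Column indices of the k smallest distances in each row (unordered).
    neighbors = np.argpartition(dist2, k, axis=1)[:, :k]
    return np.bincount(neighbors.ravel(), minlength=n)


rng = np.random.default_rng(0)
n, k = 500, 5  # assumed sample size and neighborhood size
max_in_degree = {}
for d in (2, 200):  # assumed low and high dimensions
    degrees = knn_in_degrees(rng.standard_normal((n, d)), k)
    max_in_degree[d] = int(degrees.max())
    print(f"d={d}: mean in-degree={degrees.mean():.1f}, max in-degree={max_in_degree[d]}")
```

The mean in-degree equals k in both cases, since every node contributes exactly k outgoing edges. What changes with dimension is how unevenly those edges land: in high dimension a few points near the center of the data become neighbors of many others, which is the kind of hub that graph-based test statistics are sensitive to.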
Taxonomy
Topics: Advanced Causal Inference Techniques · Statistical Methods and Inference
