Asymptotic Distribution and Detection Thresholds for Two-Sample Tests Based on Geometric Graphs
Bhaswar B. Bhattacharya

TL;DR
This paper analyzes the asymptotic behavior and detection capabilities of geometric graph-based two-sample tests, such as those using minimum spanning trees and K-nearest neighbor graphs, under general alternatives.
Contribution
Using the theory of stabilizing geometric graphs, it derives the asymptotic distribution of these tests under general alternatives in the Poissonized setting, and from this obtains the detection threshold and limiting local power of the K-NN test.
Findings
Asymptotic distribution of tests under general alternatives
Detection thresholds depending on dimension
Comparison of test performance in various scenarios
Abstract
In this paper, we consider the problem of testing the equality of two multivariate distributions based on geometric graphs constructed using the interpoint distances between the observations. These include the tests based on the minimum spanning tree and the K-nearest neighbor (NN) graphs, among others. These tests are asymptotically distribution-free, universally consistent and computationally efficient, making them particularly useful in modern applications. However, very little is known about the power properties of these tests. In this paper, using the theory of stabilizing geometric graphs, we derive the asymptotic distribution of these tests under general alternatives, in the Poissonized setting. Using this, the detection threshold and the limiting local power of the test based on the K-NN graph are obtained, where interesting exponents depending on dimension emerge. This…
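To make the kind of test described in the abstract concrete, here is a minimal Python sketch of a two-sample test built on the K-NN graph of the pooled sample. It is an illustration under stated assumptions, not the paper's procedure: the statistic is taken to be the number of directed K-NN edges joining points from different samples, calibration uses label permutations instead of the asymptotic null distribution, and the Poissonized setting is not reproduced. The function names (`knn_neighbors`, `cross_edge_count`, `knn_two_sample_test`) are hypothetical.

```python
import numpy as np


def knn_neighbors(points, k):
    """Return an (n, k) array holding the indices of each point's k nearest neighbors."""
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)  # squared Euclidean distances
    np.fill_diagonal(dist, np.inf)               # a point is never its own neighbor
    return np.argsort(dist, axis=1)[:, :k]


def cross_edge_count(nbrs, labels):
    """Count directed K-NN edges whose two endpoints carry different sample labels."""
    return int(np.sum(labels[:, None] != labels[nbrs]))


def knn_two_sample_test(x, y, k=3, n_perm=500, seed=0):
    """Permutation-calibrated K-NN graph test (illustrative sketch only).

    Assumption: few cross-sample edges is evidence against equality of the
    two distributions, so the p-value is computed from the lower tail.
    """
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    labels = np.r_[np.zeros(len(x), dtype=int), np.ones(len(y), dtype=int)]

    # The graph depends only on the pooled points, so it is built once
    # and reused for every relabeling.
    nbrs = knn_neighbors(pooled, k)
    observed = cross_edge_count(nbrs, labels)

    perm_stats = np.array(
        [cross_edge_count(nbrs, rng.permutation(labels)) for _ in range(n_perm)]
    )
    p_value = (1 + np.sum(perm_stats <= observed)) / (n_perm + 1)
    return observed, float(p_value)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    x = rng.normal(loc=0.0, size=(100, 2))
    y = rng.normal(loc=0.5, size=(100, 2))  # mean-shifted alternative
    stat, p = knn_two_sample_test(x, y, k=3)
    print(f"cross-sample edges: {stat}, permutation p-value: {p:.4f}")
```

The brute-force distance matrix costs O(n^2) memory and is only meant for small samples; a spatial index would be the natural replacement for larger ones.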
Taxonomy
Topics: Random Matrices and Applications · Stochastic processes and statistical mechanics · Bayesian Methods and Mixture Models
