A General Asymptotic Framework for Distribution-Free Graph-Based Two-Sample Tests
Bhaswar B. Bhattacharya

TL;DR
This paper introduces a unified asymptotic framework for distribution-free graph-based two-sample tests, analyzing their efficiency and guiding their practical application in multivariate distribution comparison.
Contribution
It provides a general theoretical framework for analyzing and comparing various graph-based two-sample tests, including new insights into their asymptotic efficiency.
Findings
The asymptotic efficiency depends on the combinatorial properties of the underlying graph.
The framework unifies analysis of tests based on geometric graphs and multivariate depth functions.
Guidelines for selecting appropriate two-sample tests in practice are derived.
Abstract
Testing equality of two multivariate distributions is a classical problem for which many non-parametric tests have been proposed over the years. Most of the popular two-sample tests, which are asymptotically distribution-free, are based either on geometric graphs constructed using inter-point distances between the observations (multivariate generalizations of the Wald-Wolfowitz runs test) or on multivariate data-depth (generalizations of the Mann-Whitney rank test). This paper introduces a general notion of distribution-free graph-based two-sample tests, and provides a unified framework for analyzing and comparing their asymptotic properties. The asymptotic (Pitman) efficiency of a general graph-based test is derived, which covers tests based on geometric graphs, such as the Friedman-Rafsky test (1979), the test based on the K-nearest neighbor graph, the cross-match test (2005),…
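To make the idea of a geometric-graph test concrete, here is a minimal sketch in the spirit of the Friedman-Rafsky test named in the abstract: build the Euclidean minimum spanning tree of the pooled sample and count the edges that join points from different samples. Few cross-sample edges suggest the two distributions differ. This is an illustration under stated assumptions, not the paper's procedure: the function names are hypothetical, and the p-value is calibrated by permuting labels rather than by the asymptotic null distribution that the paper analyzes.

```python
import math
import random


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns n-1 edges (i, j)."""
    n = len(points)
    in_tree = [False] * n
    best_dist = [math.inf] * n   # distance from each vertex to the current tree
    best_from = [-1] * n         # tree vertex achieving that distance
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


def cross_edge_count(edges, labels):
    """Number of tree edges whose endpoints carry different sample labels."""
    return sum(1 for i, j in edges if labels[i] != labels[j])


def fr_permutation_test(x, y, n_perm=500, seed=0):
    """Cross-edge count on the pooled MST, with a permutation p-value.

    Assumption: permutation calibration is used here for simplicity; the
    tree depends only on the pooled points, so it is built once and only
    the labels are shuffled. Small counts are treated as evidence against
    equality of the two distributions.
    """
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    labels = [0] * len(x) + [1] * len(y)
    edges = mst_edges(pooled)
    observed = cross_edge_count(edges, labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if cross_edge_count(edges, labels) <= observed:
            hits += 1
    p_value = (1 + hits) / (1 + n_perm)
    return observed, p_value


# Usage: two bivariate Gaussian samples that differ by a location shift.
rng = random.Random(1)
sample_x = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(30)]
sample_y = [(rng.gauss(1.5, 1), rng.gauss(1.5, 1)) for _ in range(30)]
count, p = fr_permutation_test(sample_x, sample_y, n_perm=200)
print(f"cross-sample MST edges: {count}, permutation p-value: {p:.3f}")
```

Replacing `mst_edges` with another graph on the pooled sample (for example a K-nearest neighbor graph) gives a different member of the same family of tests, which is the kind of comparison the paper's framework addresses.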
