A General Asymptotic Framework for Distribution-Free Graph-Based Two-Sample Tests

Bhaswar B. Bhattacharya

arXiv:1508.07530·math.ST·April 17, 2019

TL;DR

This paper introduces a unified asymptotic framework for distribution-free graph-based two-sample tests, analyzing their efficiency and guiding their practical application in multivariate distribution comparison.

Contribution

It provides a general theoretical framework for analyzing and comparing graph-based two-sample tests, and derives the asymptotic (Pitman) efficiency of a general graph-based test.

Findings

01

The asymptotic efficiency depends on the combinatorial properties of the underlying graph.

02

The framework unifies analysis of tests based on geometric graphs and multivariate depth functions.

03

Guidelines for selecting appropriate two-sample tests in practice are derived.

Abstract

Testing equality of two multivariate distributions is a classical problem for which many non-parametric tests have been proposed over the years. Most of the popular two-sample tests, which are asymptotically distribution-free, are based either on geometric graphs constructed using inter-point distances between the observations (multivariate generalizations of the Wald-Wolfowitz runs test) or on multivariate data-depth (generalizations of the Mann-Whitney rank test). This paper introduces a general notion of distribution-free graph-based two-sample tests, and provides a unified framework for analyzing and comparing their asymptotic properties. The asymptotic (Pitman) efficiency is derived for a general graph-based test, a class that includes tests based on geometric graphs, such as the Friedman-Rafsky test (1979), the test based on the $K$-nearest neighbor graph, the cross-match test (2005),…
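To make the idea of a graph-based two-sample test concrete, here is a minimal sketch of the Friedman-Rafsky statistic mentioned in the abstract: build the Euclidean minimum spanning tree on the pooled sample and count the edges that join points from different samples. Few cross-sample edges suggest the two distributions differ. The function names (`mst_edges`, `cross_edge_count`, `friedman_rafsky_permutation_test`) are hypothetical and not taken from the paper, and the p-value is calibrated by permuting labels, which is a substitution made here for simplicity; the paper itself studies the asymptotic behavior of such tests rather than this permutation scheme.

```python
import math
import random


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns the n-1 edges (i, j) of a minimum spanning tree.
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known distance to the tree
    best_from = [-1] * n         # tree vertex achieving that distance
    in_tree[0] = True
    for j in range(1, n):
        best_dist[j] = math.dist(points[0], points[j])
        best_from[j] = 0
    edges = []
    for _ in range(n - 1):
        # Pick the non-tree vertex closest to the current tree.
        j = min((k for k in range(n) if not in_tree[k]),
                key=lambda k: best_dist[k])
        in_tree[j] = True
        edges.append((best_from[j], j))
        # Update distances of the remaining vertices to the tree.
        for k in range(n):
            if not in_tree[k]:
                d = math.dist(points[j], points[k])
                if d < best_dist[k]:
                    best_dist[k] = d
                    best_from[k] = j
    return edges


def cross_edge_count(edges, labels):
    """Number of edges whose endpoints carry different sample labels."""
    return sum(1 for i, j in edges if labels[i] != labels[j])


def friedman_rafsky_permutation_test(x, y, n_perm=500, seed=0):
    """Cross-sample MST edge count with a label-permutation p-value.

    Assumption: permutation calibration is used here instead of the
    asymptotic null distribution. Small counts are evidence against
    equality of the two distributions.
    """
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    labels = [0] * len(x) + [1] * len(y)
    edges = mst_edges(pooled)    # the graph depends only on pooled data
    observed = cross_edge_count(edges, labels)
    hits = 0
    for _ in range(n_perm):
        perm = labels[:]
        rng.shuffle(perm)
        if cross_edge_count(edges, perm) <= observed:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value


if __name__ == "__main__":
    x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
    y = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    print(friedman_rafsky_permutation_test(x, y))
```

Because the spanning tree is built from the pooled points alone, it is computed once and reused across permutations; only the labels change. This is the sense in which the test depends on the data through the graph, and it is the combinatorial structure of that graph that the findings above tie to asymptotic efficiency.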
