Limiting distributions of graph-based test statistics on sparse and dense graphs

Yejiong Zhu; Hao Chen

arXiv:2108.07446·math.ST·November 14, 2023

Open Access

TL;DR

This paper develops asymptotic theory for graph-based two-sample tests on similarity graphs ranging from sparse to much denser ones, supporting their use with high-dimensional and non-Euclidean data.

Contribution

It extends the asymptotic theory of graph-based tests to include much denser graphs than previously studied, relaxing earlier strong conditions.

Findings

01

Theoretical results for test statistics on dense graphs.

02

Validation of test performance across various graph densities.

03

Broader applicability of graph-based tests in high-dimensional settings.

Abstract

Two-sample tests utilizing a similarity graph on observations are useful for high-dimensional and non-Euclidean data due to their flexibility and good performance under a wide range of alternatives. Existing work has mainly focused on sparse graphs, such as graphs with the number of edges on the order of the number of observations, and its asymptotic results impose strong conditions on the graph that can easily be violated by the commonly constructed graphs it suggests. Moreover, graph-based tests perform better with denser graphs under many settings. In this work, we establish the theoretical foundation for graph-based tests with graphs ranging from those recommended in the current literature to much denser ones.
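To make the idea of a graph-based two-sample test concrete, the following is a minimal sketch and not the paper's method. It assumes a k-nearest-neighbor similarity graph under Euclidean distance, uses the classical between-sample edge count as the statistic, and calibrates it with a permutation null rather than the limiting distributions the paper studies. All function names (`knn_edges`, `between_sample_edges`, `edge_count_permutation_test`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def knn_edges(points, k):
    """Undirected edges of a k-nearest-neighbor graph (Euclidean distance assumed)."""
    n = len(points)
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]
    # Store each edge once as (smaller index, larger index).
    edges = {(min(i, int(j)), max(i, int(j))) for i in range(n) for j in neighbors[i]}
    return np.array(sorted(edges))


def between_sample_edges(edges, labels):
    """Count edges whose endpoints carry different sample labels."""
    return int(np.sum(labels[edges[:, 0]] != labels[edges[:, 1]]))


def edge_count_permutation_test(x, y, k=5, n_perm=500, seed=0):
    """Illustrative edge-count test: few between-sample edges suggest a difference."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    labels = np.concatenate([np.zeros(len(x), dtype=int), np.ones(len(y), dtype=int)])
    edges = knn_edges(pooled, k)  # the graph is built once on the pooled data
    observed = between_sample_edges(edges, labels)
    # Permuting labels keeps the graph fixed and breaks any sample/graph association.
    perm_counts = np.array(
        [between_sample_edges(edges, rng.permutation(labels)) for _ in range(n_perm)]
    )
    p_value = (1 + np.sum(perm_counts <= observed)) / (1 + n_perm)
    return observed, p_value, len(edges)
```

In this sketch the parameter `k` controls graph density: the number of edges grows roughly in proportion to `k` times the number of observations, so raising `k` moves from the sparse regime toward the denser graphs the abstract refers to.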

Taxonomy

Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference · Data-Driven Disease Surveillance