On High Dimensional Behaviour of Some Two-Sample Tests Based on Ball Divergence
Bilol Banerjee, Anil K. Ghosh

TL;DR
This paper introduces two-sample tests based on ball divergence and analyzes their consistency and power in high dimensions, especially in the High Dimension, Low Sample Size (HDLSS) setting, with theoretical proofs and empirical comparisons.
Contribution
It develops new high-dimensional two-sample tests based on ball divergence and establishes their consistency and optimality under various growth conditions.
Findings
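The proposed tests are consistent in the HDLSS regime (dimension growing, sample sizes fixed) under appropriate regularity conditions.
The power of the tests can converge to one when the sample sizes grow with the dimension at an appropriate rate, even for shrinking alternatives.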
Proposed tests outperform existing methods in simulations and benchmarks.
Abstract
In this article, we propose some two-sample tests based on ball divergence and investigate their high dimensional behavior. First, we study their behavior for High Dimension, Low Sample Size (HDLSS) data, and under appropriate regularity conditions, we establish their consistency in the HDLSS regime, where the dimension of the data grows to infinity while the sample sizes from the two distributions remain fixed. Further, we show that these conditions can be relaxed when the sample sizes also increase with the dimension, and in such cases, consistency can be proved even for shrinking alternatives. We use a simple example involving two normal distributions to prove that even when there are no consistent tests in the HDLSS regime, the powers of the proposed tests can converge to unity if the sample sizes increase with the dimension at an appropriate rate. This rate is obtained by…
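To make the idea of a ball-divergence two-sample test concrete, here is a minimal Python sketch. It is not the authors' procedure: the summary above does not give their test statistics, so the sketch assumes the usual sample ball divergence (in the form introduced by Pan et al., 2018) with the Euclidean distance, and calibrates it with a permutation test. The function names `ball_divergence` and `permutation_test` and the parameters `n_perm` and `seed` are hypothetical choices made for illustration.

```python
import numpy as np


def ball_divergence(x, y):
    """Sample ball divergence between x (n, d) and y (m, d).

    Assumption: Euclidean distance and closed balls B(u, v) centred at u
    with radius ||u - v||. Cost is O((n + m)^3), fine for small samples.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    z = np.vstack([x, y])
    # Pairwise Euclidean distances over the pooled sample.
    dist = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=2)

    a_sum = 0.0  # balls with centre and radius point both taken from x
    c_sum = 0.0  # balls with centre and radius point both taken from y
    for i in range(n + m):
        # inside[j, t] is True when z[t] lies in the ball centred at z[i]
        # whose radius is the distance from z[i] to z[j].
        inside = dist[i][None, :] <= dist[i][:, None]
        px = inside[:, :n].mean(axis=1)  # fraction of x in each ball
        py = inside[:, n:].mean(axis=1)  # fraction of y in each ball
        diff2 = (px - py) ** 2
        if i < n:
            a_sum += diff2[:n].sum()
        else:
            c_sum += diff2[n:].sum()
    return a_sum / n**2 + c_sum / m**2


def permutation_test(x, y, n_perm=200, seed=0):
    """Permutation p-value for the statistic above (illustrative calibration)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.vstack([x, y])
    n = len(x)
    observed = ball_divergence(x, y)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(z))
        if ball_divergence(z[idx[:n]], z[idx[n:]]) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # HDLSS-style toy data: dimension 500, ten observations per sample.
    rng = np.random.default_rng(1)
    x = rng.normal(0.0, 1.0, size=(10, 500))
    y = rng.normal(0.3, 1.0, size=(10, 500))
    stat, p_value = permutation_test(x, y, n_perm=100)
    print(f"statistic = {stat:.4f}, permutation p-value = {p_value:.4f}")
```

The statistic compares, for every ball defined by a pair of points from the same sample, the fraction of each sample that falls inside the ball. It is zero when the two samples coincide and grows as the samples separate. The toy data use a location shift between two normal distributions only as an illustration; the regimes and rates studied in the paper are not reproduced here.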
Taxonomy
Topics: Statistical Methods and Inference · Bayesian Methods and Mixture Models · Gene expression and cancer classification
