Maximum-of-Differences Test for Comparing Multivariate K-Sample Distributions
Wei Lan, Long Feng, Runze Li, Chih-Ling Tsai

TL;DR
This paper introduces a maximum-of-differences (MOD) test for comparing multivariate K-sample distributions. The test applies to multivariate data and extends to multivariate regression models; the authors establish its asymptotic properties and evaluate it in simulations and real data examples.
Contribution
The paper proposes the MOD test and a covariance-adjusted version (CA-MOD) for multivariate K-sample comparison, extends the approach to multivariate regression models, and supports it with asymptotic theory and empirical results.
Findings
The CA-MOD test statistic converges in distribution to the Type I extreme value (Gumbel) distribution.
The tests perform well in simulations and real data examples.
The method is extendable to multivariate regression models.
Abstract
Comparing K-sample distributions is a fundamental problem in data science that arises in a wide variety of fields and applications. In this article, we introduce a maximum-of-differences approach to make such comparisons. Specifically, we first calculate the pairwise distances from the pooled observations of the samples. We then define the two observations as connected if their distance is less than a pre-specified threshold value. For each observation, we next calculate the "within" and the "between" probabilities associated with these two types of connections for the given observation, i.e., with other observations within the same sample and between the given observation and the observations in other samples. Subsequently, we propose a maximum-of-differences (MOD) test that finds the maximum value among the standardized squared differences between the "within" and the…
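The first steps described in the abstract (pooling, pairwise distances, thresholded connections, and per-observation "within" and "between" probabilities) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `connection_probabilities`, the use of Euclidean distance, and the estimation of each probability as a simple empirical proportion are all hypothetical choices. The standardization and the final MOD statistic are not reproduced, because the abstract is cut off before they are specified.

```python
import math


def connection_probabilities(samples, threshold):
    """Illustrative sketch (assumed names and estimators, not the paper's code).

    samples:   list of K samples, each a list of points (tuples of floats).
    threshold: two observations count as connected if their Euclidean
               distance is strictly less than this value.
    Returns one (sample_index, within, between) tuple per pooled observation.
    """
    # Pool all observations, remembering which sample each one came from.
    pooled = [(point, k) for k, sample in enumerate(samples) for point in sample]

    results = []
    for i, (x_i, k_i) in enumerate(pooled):
        within_hits = within_total = 0
        between_hits = between_total = 0
        for j, (x_j, k_j) in enumerate(pooled):
            if i == j:
                continue  # an observation is not compared with itself
            connected = math.dist(x_i, x_j) < threshold
            if k_i == k_j:
                within_total += 1
                within_hits += connected
            else:
                between_total += 1
                between_hits += connected
        # Empirical proportions; defined as 0.0 when there is nothing to compare.
        within = within_hits / within_total if within_total else 0.0
        between = between_hits / between_total if between_total else 0.0
        results.append((k_i, within, between))
    return results


# Two well-separated samples: each point connects only within its own sample.
separated = connection_probabilities(
    [[(0.0, 0.0), (0.0, 1.0)], [(5.0, 5.0), (5.0, 6.0)]], threshold=2.0
)

# Two identical samples: each point connects only to its copy in the other sample.
identical = connection_probabilities(
    [[(0.0, 0.0), (1.0, 0.0)], [(0.0, 0.0), (1.0, 0.0)]], threshold=0.5
)
```

In the separated example every observation has a within proportion of 1 and a between proportion of 0, which is the kind of gap between the two quantities that the test is described as detecting. The double loop costs O(n²) distance evaluations over the pooled sample, which is adequate for a sketch but would likely be vectorized in practice.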
