A new test for the multivariate two-sample problem based on the concept of minimum energy
Guenter Zech, Berkan Aslan

TL;DR
This paper introduces a multivariate two-sample test based on a quantity called the energy, a simple logarithmic function of the distances between observations, and reports that the test is especially powerful in multidimensional settings.
Contribution
The paper proposes an energy-based test for the multivariate two-sample problem. The distribution of the test statistic is determined by a resampling method, and the test is reported to be especially powerful in higher dimensions.
Findings
The energy test is especially powerful in multidimensional applications.
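The distribution of the test statistic is determined by a resampling method.
In one dimension, the power of the energy test was compared with that of several nonparametric tests; in two and four dimensions, it was compared with the Friedman-Rafsky and nearest neighbor tests.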
Abstract
We introduce a new statistical quantity, the energy, to test whether two samples originate from the same distribution. The energy is a simple logarithmic function of the distances of the observations in the variate space. The distribution of the test statistic is determined by a resampling method. The power of the energy test in one dimension was studied for a variety of different test samples and compared to several nonparametric tests. In two and four dimensions, a comparison was performed with the Friedman-Rafsky and nearest neighbor tests. The two-sample energy test was shown to be especially powerful in multidimensional applications.
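The abstract describes the energy only as a logarithmic function of the distances between observations, with the null distribution obtained by resampling. The sketch below is one way this could look, not the authors' implementation. It assumes the distance function R(r) = -ln r, a statistic built from within-sample sums (weighted by 1/n² and 1/m²) minus a between-sample sum (weighted by 1/(nm)), and a permutation of the pooled sample as the resampling scheme. The names `energy_statistic` and `permutation_p_value`, the cutoff `eps`, and the defaults `n_perm` and `seed` are hypothetical choices made for illustration.

```python
import numpy as np


def _as_2d(a):
    """Return the sample as a float array of shape (n_points, n_dims)."""
    a = np.asarray(a, dtype=float)
    return a[:, None] if a.ndim == 1 else a


def _pairwise_dist(a, b):
    """Euclidean distances between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def energy_statistic(x, y, eps=1e-12):
    """Energy-type statistic with the assumed distance function R(r) = -ln(r).

    Assumed form: within-sample sums over pairs i < j, weighted by 1/n^2
    and 1/m^2, minus the between-sample sum weighted by 1/(n*m).
    eps is a hypothetical cutoff that keeps log(0) out of the sum.
    """
    x, y = _as_2d(x), _as_2d(y)
    n, m = len(x), len(y)

    def R(r):
        return -np.log(np.maximum(r, eps))

    phi_xx = R(_pairwise_dist(x, x)[np.triu_indices(n, k=1)]).sum() / n**2
    phi_yy = R(_pairwise_dist(y, y)[np.triu_indices(m, k=1)]).sum() / m**2
    phi_xy = R(_pairwise_dist(x, y)).sum() / (n * m)
    return phi_xx + phi_yy - phi_xy


def permutation_p_value(x, y, n_perm=200, seed=0):
    """Estimate a p-value by reshuffling the pooled sample (assumed scheme).

    Large values of the statistic count as evidence against the
    hypothesis that both samples come from the same distribution.
    """
    x, y = _as_2d(x), _as_2d(y)
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x, y])
    n = len(x)
    observed = energy_statistic(x, y)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        if energy_statistic(pooled[idx[:n]], pooled[idx[n:]]) >= observed:
            exceed += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    a = rng.normal(size=(40, 2))
    b = rng.normal(loc=3.0, size=(40, 2))  # clearly shifted sample
    stat, p = permutation_p_value(a, b)
    print(f"statistic = {stat:.4f}, p-value = {p:.4f}")
```

Working directly on the distance matrices keeps the sketch short, but memory grows quadratically with sample size, so larger samples would need a chunked computation.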
Taxonomy
Topics: Advanced Statistical Methods and Models · Statistical Mechanics and Entropy · Statistical Methods and Inference
