The AUGUST Two-Sample Test: Powerful, Interpretable, and Fast
Benjamin Brown (1), Kai Zhang (1) ((1) Department of Statistics and Operations Research, University of North Carolina at Chapel Hill)

TL;DR
AUGUST is a new non-parametric univariate two-sample test that offers improved interpretability, computational efficiency, and competitive power across various contexts, including complex distributional differences.
Contribution
The paper introduces AUGUST, a novel two-sample test that uses symmetry statistics from binary expansion, providing exact distribution-freeness and interpretability.
Findings
AUGUST achieves power comparable to leading tests across multiple scenarios.
AUGUST demonstrates greater power in certain complex distributional cases.
AUGUST offers clear interpretability, illustrated by an analysis of NBA shooting data.
Abstract
Two-sample testing is a fundamental problem in statistics, and many famous two-sample tests are designed to be fully non-parametric. These existing methods perform well with location and scale shifts but are less robust when faced with more exotic classes of alternatives, and rejections from these tests can be difficult to interpret. Here, we propose a new univariate non-parametric two-sample test, AUGUST, designed to improve on these aspects. AUGUST tests for inequality in distribution up to a predetermined resolution using symmetry statistics from binary expansion. The AUGUST statistic is exactly distribution-free and has a well-understood asymptotic distribution, permitting fast p-value computation. In empirical studies, we show that AUGUST has power comparable to that of the best existing methods in every context, as well as greater power in some circumstances. We illustrate the…
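To make "symmetry statistics from binary expansion" up to a fixed resolution more concrete, here is a minimal Python sketch. It is an assumption-laden illustration, not the authors' AUGUST statistic: the function name `dyadic_symmetry_statistics`, the choice to evaluate the empirical CDF of `x` at each `y`, the `len(x) + 1` normalisation, and the use of a Hadamard matrix to form the contrasts are all hypothetical choices made for this example. The idea it shows is that the first `depth` binary digits of a rank-transformed value place it in one of `2**depth` dyadic cells, and signed sums of cell counts summarise how the second sample is spread relative to the first.

```python
import numpy as np

def dyadic_symmetry_statistics(x, y, depth=2):
    """Illustrative sketch only (not the authors' exact AUGUST statistic).

    Returns 2**depth - 1 signed contrasts of dyadic cell counts.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_cells = 2 ** depth

    # Empirical CDF of x evaluated at each y_j, scaled to lie in [0, 1).
    u = np.searchsorted(np.sort(x), y, side="right") / (len(x) + 1)

    # The first `depth` binary digits of u pick one of 2**depth dyadic cells.
    cells = np.minimum((u * n_cells).astype(int), n_cells - 1)
    counts = np.bincount(cells, minlength=n_cells)

    # Sylvester-type Hadamard matrix: each row is a +1/-1 pattern over cells.
    h = np.array([[1]])
    for _ in range(depth):
        h = np.block([[h, h], [h, -h]])

    # Drop the all-ones row, which only returns len(y).
    return (h @ counts)[1:]
```

Because this sketch depends on the data only through ranks, its null distribution does not depend on the common continuous distribution of the two samples. Contrasts near zero indicate that `y` is spread evenly across the dyadic cells defined by `x`; large contrasts point to the cells where the two samples differ, which is the kind of interpretability the abstract describes.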
Taxonomy
Topics: Advanced Statistical Methods and Models · Statistical Methods and Bayesian Inference
