A New ECDF Two-Sample Test Statistic

Connor Dowd

arXiv:2007.01360 · stat.ME · July 6, 2020 · 24 cites

TL;DR

This paper introduces a new ECDF-based two-sample test statistic that offers higher power than existing tests, along with a practical R implementation and simulation results demonstrating its effectiveness.

Contribution

A novel ECDF two-sample test statistic with improved power and a computationally efficient R package implementation.

Findings

01 The new test outperforms existing ECDF-based tests in power.

02 The R package enables efficient finite-sample p-value computation.

03 Simulation studies confirm the test's superior performance.

Abstract

Empirical cumulative distribution functions (ECDFs) have been used to test the hypothesis that two samples come from the same distribution since the seminal contribution by Kolmogorov and Smirnov. This paper describes a statistic which is usable under the same conditions as Kolmogorov-Smirnov, but provides more power than other extant tests in that vein. I demonstrate a valid (conservative) procedure for producing finite-sample p-values. I outline the close relationship between this statistic and its two main predecessors. I also provide a public R package (CRAN: twosamples [2018]) implementing the testing procedure in $O(N \log N)$ time with $O(N)$ memory. Using the package's functions, I perform several simulation studies showing the power improvements.
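
For readers unfamiliar with ECDF-based testing, the following is a minimal Python sketch of the general idea, not the paper's method. The abstract does not define the new statistic on this page, so the sketch uses the classical two-sample Kolmogorov-Smirnov statistic as a stand-in and pairs it with a generic permutation p-value. The function names `ks_statistic` and `permutation_p_value`, the default `n_perm=999`, and the `(1 + exceed) / (1 + n_perm)` rule are illustrative assumptions and are not taken from the twosamples package.

```python
import random


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max |F_a(x) - F_b(x)|.

    Stand-in for an ECDF statistic; this is NOT the paper's new statistic.
    """
    n_a, n_b = len(a), len(b)
    # Tag each observation with its sample and sort once: O(N log N) time, O(N) memory.
    pooled = sorted([(x, 0) for x in a] + [(x, 1) for x in b])
    count_a = count_b = 0
    best = 0.0
    i = 0
    while i < len(pooled):
        # Consume every observation tied at this value before comparing the ECDFs.
        value = pooled[i][0]
        while i < len(pooled) and pooled[i][0] == value:
            if pooled[i][1] == 0:
                count_a += 1
            else:
                count_b += 1
            i += 1
        best = max(best, abs(count_a / n_a - count_b / n_b))
    return best


def permutation_p_value(a, b, n_perm=999, seed=0):
    """Permutation p-value for the statistic above (assumed procedure, for illustration)."""
    rng = random.Random(seed)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    exceed = 0
    for _ in range(n_perm):
        # Under the null hypothesis, sample labels are exchangeable, so reshuffle them.
        rng.shuffle(pooled)
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            exceed += 1
    # Counting the observed split itself keeps the p-value strictly positive and conservative.
    return (1 + exceed) / (1 + n_perm)
```

Calling `permutation_p_value(x, y)` on two lists of numbers returns a value in (0, 1]. Swapping in a different ECDF-based statistic only requires replacing `ks_statistic`; the permutation step is unchanged.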

Code & Models

Repositories

cdowd/twosamples

Taxonomy

Topics: Statistical Methods and Inference · Statistical Methods and Bayesian Inference · Statistical Methods in Clinical Trials