TL;DR
This paper introduces a unified method for sample size planning for the Wilcoxon-Mann-Whitney test across various data types, providing formulas for optimal sample sizes and allocation ratios based on distribution characteristics.
Contribution
It presents a comprehensive approach that covers multiple data types and derives formulas for optimal sample size and allocation, including conditions for balanced designs.
Findings
Balanced design is optimal for certain distributions.
Optimal allocation depends on the ratio of variances under the alternative.
The method applies to metric, categorical, and dichotomous data.
Abstract
Many procedures have been proposed for sample size planning for the Wilcoxon-Mann-Whitney test at given type-I and type-II error rates α and β, respectively. Most methods assume very specific models or types of data in order to simplify calculations (for example, ordered categorical or metric data, location shift alternatives, etc.). We present a unified approach that covers metric data with and without ties, count data, ordered categorical data, and even dichotomous data. To this end, we calculate the unknown theoretical quantities, such as the variances under the null and the relevant alternative hypothesis, by means of the following 'synthetic data' approach. We evaluate data whose empirical distribution functions match the theoretical distribution functions involved in the computations of the unknown theoretical quantities. Then well-known relations for the…
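To make the planning problem concrete, here is a minimal Python sketch. It is not the paper's formula: it substitutes Noether's classical approximation, which assumes continuous data without ties and is therefore only a rough guide when ties are present. The function names `relative_effect` and `noether_total_n`, and the parameter `t` for the share of observations allocated to the first group, are hypothetical choices made for this illustration. The first function estimates the relative effect p = P(X < Y) + 0.5 P(X = Y) from two samples that stand in for the planned distributions; the second turns p into a total sample size for a two-sided test.

```python
import math
from statistics import NormalDist


def relative_effect(x, y):
    """Estimate p = P(X < Y) + 0.5 * P(X = Y) by comparing all pairs.

    x and y are samples standing in for the two planned distributions
    (an assumption of this sketch, e.g. pilot or prior-information data).
    """
    total = 0.0
    for xi in x:
        for yj in y:
            if xi < yj:
                total += 1.0
            elif xi == yj:
                total += 0.5  # ties count half
    return total / (len(x) * len(y))


def noether_total_n(p, alpha=0.05, power=0.8, t=0.5):
    """Total sample size N from Noether's approximation (two-sided test).

    N = (z_{1-alpha/2} + z_{power})^2 / (12 * t * (1 - t) * (p - 1/2)^2)

    t is the fraction of N allocated to the first group. This formula
    assumes continuous data; it is NOT the unified method of the paper.
    """
    if not 0.0 < t < 1.0:
        raise ValueError("t must lie strictly between 0 and 1")
    if p == 0.5:
        raise ValueError("p = 0.5 means no effect; no finite N exists")
    z = NormalDist().inv_cdf
    n = (z(1 - alpha / 2) + z(power)) ** 2 / (12 * t * (1 - t) * (p - 0.5) ** 2)
    return math.ceil(n)


if __name__ == "__main__":
    p_hat = relative_effect([1, 2, 3], [2, 3, 4])
    print(p_hat, noether_total_n(p_hat))
```

Under this approximation the factor t(1 - t) is largest at t = 0.5, so a balanced design always minimises N. That holds only because Noether's formula uses the null variance; the paper's finding that the optimal allocation depends on the variances under the alternative is not captured by this sketch.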
