Minimax Optimality of Sign Test for Paired Heterogeneous Data
Martin J. Zhang, Meisam Razaviyayn, David Tse

TL;DR
This paper shows that, for paired heterogeneous Gaussian data, the sign test is minimax optimal in the one-sided case and near optimal in the two-sided case, meaning it maximizes (or nearly maximizes) the worst-case power. The result is validated on synthetic and real RNA-Seq data.
Contribution
It establishes the minimax optimality of the sign test for paired heterogeneous data, a setting in which traditional methods often ignore heterogeneity by assuming the samples are identically distributed.
Findings
The sign test is minimax optimal in the one-sided case.
The sign test is near optimal in the two-sided case.
On synthetic and real RNA-Seq data, the sign test outperforms other popular tests.
Abstract
Comparing two groups under different conditions is ubiquitous in the biomedical sciences. In many cases, samples from the two groups can be naturally paired; for example, a pair of samples may come from the same individual under the two conditions. However, samples across different individuals may be highly heterogeneous. Traditional methods often ignore such heterogeneity by assuming the samples are identically distributed. In this work, we study the problem of comparing paired heterogeneous data by modeling the data as Gaussian distributed with different parameters across the samples. We show that in the minimax setting, where we want to maximize the worst-case power, the sign test, which only uses the signs of the differences between the paired samples, is optimal in the one-sided case and near optimal in the two-sided case. The superiority of the sign test over other popular tests for…
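As an illustration of the test the abstract refers to, here is a minimal sketch of a one-sided sign test on paired samples. It is the textbook form of the test, not the authors' code: the function name `sign_test_one_sided` is hypothetical, ties (zero differences) are dropped by a common convention, and the p-value is the exact upper tail of a Binomial(n, 1/2) null distribution.

```python
from math import comb


def sign_test_one_sided(x, y):
    """One-sided sign test for the alternative that x tends to exceed y.

    Returns (k, n, p_value), where n is the number of non-tied pairs,
    k is the number of pairs with x > y, and p_value = P(K >= k) for
    K ~ Binomial(n, 1/2).
    """
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of paired samples")

    # Only the signs of the paired differences are used; magnitudes are ignored.
    diffs = [a - b for a, b in zip(x, y)]

    # Assumption: ties carry no sign information and are discarded.
    nonzero = [d for d in diffs if d != 0]
    n = len(nonzero)
    k = sum(1 for d in nonzero if d > 0)

    # Exact upper-tail probability under the null (each sign is a fair coin flip).
    p_value = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return k, n, p_value


# Toy paired data (made up for illustration): 7 of 8 differences are positive.
x = [2, 3, 4, 5, 6, 7, 8, 1]
y = [1, 1, 1, 1, 1, 1, 1, 2]
print(sign_test_one_sided(x, y))  # (7, 8, 0.03515625)
```

Because the statistic depends only on signs, rescaling any individual pair's difference leaves the result unchanged, which is one intuition for why the test is robust when pairs have very different variances.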
Taxonomy
Topics: Gene expression and cancer classification · Statistical Methods and Inference · Bayesian Methods and Mixture Models
