Measure of Strength of Evidence for Visually Observed Differences between Subpopulations
Xi Yang, Jan Hannig, Katherine A. Hoadley, Iain Carmichael, J.S. Marron

TL;DR
This paper introduces the Population Difference Criterion for assessing the significance of visually observed subpopulation differences, addressing challenges in high-dimensional and high-signal contexts with permutation and bootstrap methods.
Contribution
It proposes a new criterion for significance testing of subpopulation differences, incorporating a balanced permutation approach and bootstrap confidence intervals for uncertainty quantification.
Findings
A balanced permutation approach is more powerful than conventional permutations in high-signal contexts.
Bootstrap confidence intervals quantify the uncertainty due to permutation variation.
The method's practical usefulness is illustrated by comparing subpopulations in modern cancer data.
Abstract
The Population Difference Criterion is proposed to assess the statistical significance of visually observed subpopulation differences. It addresses the following challenges: in high-dimensional contexts, distributional models can be dubious; in high-signal contexts, conventional permutation tests give poor pairwise comparisons. We also make two other contributions. First, based on a careful analysis, we find that a balanced permutation approach is more powerful in high-signal contexts than conventional permutations. Second, we quantify the uncertainty due to permutation variation via a bootstrap confidence interval. The practical usefulness of these ideas is illustrated in the comparison of subpopulations of modern cancer data.
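To make the two ideas in the abstract concrete, here is a minimal Python sketch of a balanced permutation null together with a bootstrap confidence interval for the resulting summary. It is not the paper's implementation. The abstract does not define the Population Difference Criterion, so the statistic used here (Euclidean distance between group means), the z-score summary, the percentile interval, and every function name are assumptions chosen for illustration. "Balanced" is taken to mean that each relabeled pseudo-group draws about half of its members from each original group.

```python
import numpy as np

def mean_diff_stat(x, y):
    """Euclidean distance between group means (an assumed stand-in statistic)."""
    return float(np.linalg.norm(x.mean(axis=0) - y.mean(axis=0)))

def balanced_permutation_stats(x, y, n_perm, rng):
    """Null statistics from balanced relabelings (hypothetical helper).

    Each pseudo-group takes about half of its members from x and half from y,
    so a true mean shift between x and y largely cancels in the null.
    """
    n, m = len(x), len(y)
    hx, hy = n // 2, m // 2
    stats = np.empty(n_perm)
    for b in range(n_perm):
        px, py = rng.permutation(n), rng.permutation(m)
        g1 = np.vstack([x[px[:hx]], y[py[:hy]]])
        g2 = np.vstack([x[px[hx:]], y[py[hy:]]])
        stats[b] = mean_diff_stat(g1, g2)
    return stats

def z_score_with_bootstrap_ci(observed, null_stats, n_boot, rng, level=0.95):
    """Z-score of the observed statistic against the permutation null, with a
    percentile bootstrap interval reflecting permutation (Monte Carlo) variation."""
    z = (observed - null_stats.mean()) / null_stats.std(ddof=1)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        # Resample the permutation statistics with replacement.
        s = rng.choice(null_stats, size=len(null_stats), replace=True)
        boot[b] = (observed - s.mean()) / s.std(ddof=1)
    alpha = 1.0 - level
    lo, hi = np.quantile(boot, [alpha / 2, 1 - alpha / 2])
    return z, (float(lo), float(hi))

# Synthetic high-dimensional, high-signal example (made-up data, not from the paper).
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(20, 50))
y = rng.normal(1.0, 1.0, size=(20, 50))

observed = mean_diff_stat(x, y)
null_stats = balanced_permutation_stats(x, y, n_perm=200, rng=rng)
z, ci = z_score_with_bootstrap_ci(observed, null_stats, n_boot=500, rng=rng)
print(f"z-score: {z:.2f}, 95% bootstrap interval: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The interval here describes only how much the summary would move if the permutations were redrawn; it says nothing about sampling variation in the data themselves. A conventional permutation test would instead shuffle all labels freely, which in a high-signal setting leaves part of the group difference in the null statistics.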
Taxonomy
Topics: Data-Driven Disease Surveillance · Computational Drug Discovery Methods · Evolution and Genetic Dynamics
