A Short Note on P-Value Hacking
Nassim Nicholas Taleb

TL;DR
This paper analyzes how p-value hacking affects the interpretation of statistical tests, revealing that the minimum p-value obtained through multiple tests can be highly skewed and volatile, impacting reproducibility and meta-analysis.
Contribution
It provides an exact distribution for p-values under multiple testing scenarios, highlighting the extreme skewness and volatility caused by p-hacking and small sample sizes.
Findings
P-values are highly skewed and volatile across repetitions.
Minimum p-values can significantly underestimate the true p-value.
Volatility is reduced only by increasing the sample size markedly or by lowering the p-value threshold substantially.
Abstract
We present the expected values from p-value hacking as a choice of the minimum p-value among independent tests, which can be considerably lower than the "true" p-value, even with a single trial, owing to the extreme skewness of the meta-distribution. We first present an exact probability distribution (meta-distribution) for p-values across ensembles of statistically identical phenomena. We derive the distribution for small samples as well as the limiting one as the sample size becomes large. We also look at the properties of the "power" of a test through the distribution of its inverse for a given p-value and parametrization. The formulas allow the investigation of the stability of the reproduction of results and "p-hacking" and other aspects of meta-analysis. P-values are shown to be extremely skewed and volatile, regardless of the sample size…
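The effect described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's exact derivation. It assumes a one-sided Gaussian z-test with known variance, a "true" (median) p-value of 0.12, and 5 independent trials per hacked result. All of these values, and the function names `one_sided_p`, `simulate_p_values`, and `simulate_min_p`, are illustrative choices made here.

```python
import math
import random
from statistics import NormalDist, mean, median


def one_sided_p(z):
    """One-sided p-value of a z-statistic under a standard normal null."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def simulate_p_values(median_p, n_experiments, rng):
    """P-values from repeated, statistically identical experiments.

    Assumption: the z-statistic is Gaussian with unit variance, shifted
    so that the median p-value across repetitions equals median_p.
    """
    shift = NormalDist().inv_cdf(1.0 - median_p)
    return [one_sided_p(shift + rng.gauss(0.0, 1.0))
            for _ in range(n_experiments)]


def simulate_min_p(median_p, m_trials, n_experiments, rng):
    """Smallest p-value kept from m_trials independent tests, repeated."""
    return [min(simulate_p_values(median_p, m_trials, rng))
            for _ in range(n_experiments)]


rng = random.Random(0)  # fixed seed so the run is repeatable
single = simulate_p_values(0.12, 20000, rng)
hacked = simulate_min_p(0.12, 5, 20000, rng)

share_single = sum(p < 0.05 for p in single) / len(single)
share_hacked = sum(p < 0.05 for p in hacked) / len(hacked)

print("single test : median p =", round(median(single), 3),
      " mean p =", round(mean(single), 3),
      " share below 0.05 =", round(share_single, 3))
print("min of 5    : median p =", round(median(hacked), 3),
      " share below 0.05 =", round(share_hacked, 3))
```

Under these assumptions, the mean of the single-test p-values sits above their median, which reflects the skewness, and keeping the minimum of several trials pushes a much larger share of results below 0.05 even though the underlying phenomenon is unchanged.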
Taxonomy
Topics: Meta-analysis and systematic reviews · Statistical Methods in Clinical Trials · Data Analysis with R
