How to Tell When a Result Will Replicate: Significance and Replication in Distributional Null Hypothesis Tests
Fintan Costello, Paul Watts

TL;DR
This paper introduces a distributional null hypothesis testing method that accounts for both within- and between-experiment variation, improving the prediction of result replicability over standard significance tests.
Contribution
The paper extends traditional significance testing by analyzing between-experiment variation in addition to within-experiment variation, which yields a more accurate estimate of replication probability.
Findings
Many significant results are consistent with random variation when between-experiment variation is considered.
Predicted replication probabilities strongly correlate with actual replication rates.
The new method offers a more reliable statistical tool for assessing result replicability.
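The first finding can be illustrated with a small simulation. The sketch below is not the authors' procedure. It is a minimal illustration under stated assumptions: each experiment's true effect is drawn from a normal distribution centred on zero with standard deviation `tau` (a hypothetical between-experiment spread, assumed known here), and the experiment's sample mean then varies around that effect with the usual standard error. The function name and all parameter values are chosen for illustration only.

```python
import math
import random


def significance_rate(n=50, sigma=1.0, tau=0.0, distributional=False,
                      trials=20000, z_crit=1.96, seed=0):
    """Fraction of simulated experiments declared significant.

    Assumption (illustrative, not from the paper): each experiment's true
    effect is drawn from Normal(0, tau), so there is no systematic effect,
    only between-experiment variation.
    """
    rng = random.Random(seed)
    se = sigma / math.sqrt(n)  # within-experiment standard error of the mean
    # A point-form null scales by se alone; the widened null also includes
    # the assumed between-experiment spread tau.
    null_sd = math.sqrt(se ** 2 + tau ** 2) if distributional else se
    hits = 0
    for _ in range(trials):
        true_effect = rng.gauss(0.0, tau)        # between-experiment variation
        sample_mean = rng.gauss(true_effect, se)  # within-experiment variation
        if abs(sample_mean / null_sd) > z_crit:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print("point-form null, tau=0.0:", significance_rate(tau=0.0))
    print("point-form null, tau=0.2:", significance_rate(tau=0.2))
    print("widened null,    tau=0.2:",
          significance_rate(tau=0.2, distributional=True))
```

With `tau=0` the point-form test rejects at roughly its nominal 5% rate. Once `tau` is non-zero, the same test rejects far more often even though there is no systematic effect, while the test scaled by the combined spread returns to roughly the nominal rate. In practice the between-experiment spread is not known and would have to be estimated, which this sketch does not attempt.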
Abstract
There is a well-known problem in Null Hypothesis Significance Testing: many statistically significant results fail to replicate in subsequent experiments. We show that this problem arises because standard 'point-form null' significance tests consider only within-experiment variation and ignore between-experiment variation, and so systematically underestimate the degree of random variation in results. We give an extension to standard significance testing that addresses this problem by analysing both within- and between-experiment variation. This 'distributional null' approach does not underestimate experimental variability and so is not overconfident in identifying significance; because this approach addresses between-experiment variation, it gives mathematically coherent estimates for the probability of replication of significant results. Using a large-scale replication dataset (the first 'Many…
Taxonomy
Topics: Statistical Methods and Inference · Statistical Methods in Clinical Trials
