Why "Redefining Statistical Significance" Will Not Improve Reproducibility and Could Make the Replication Crisis Worse

Harry Crane

arXiv:1711.07801·stat.AP·November 22, 2017

TL;DR

Redefining statistical significance from P<0.05 to P<0.005 does not reliably improve reproducibility and may worsen the replication crisis, especially when considering P-hacking effects.

Contribution

The paper critically evaluates claims that lowering the significance threshold improves reproducibility, highlighting the overlooked impact of P-hacking and potential negative consequences.

Findings

01

The perceived benefits of a lower significance threshold are exaggerated.

02

Accounting for P-hacking diminishes the expected improvements.

03

Lowering the cutoff could worsen the replication crisis under certain scenarios.

Abstract

A recent proposal to "redefine statistical significance" (Benjamin et al., Nature Human Behaviour, 2017) claims that false positive rates "would immediately improve" by factors greater than two and replication rates would double simply by changing the conventional cutoff for 'statistical significance' from P<0.05 to P<0.005. I analyze the veracity of these claims, focusing especially on how Benjamin et al. neglect the effects of P-hacking in assessing the impact of their proposal. My analysis shows that once P-hacking is accounted for, the perceived benefits of the lower threshold all but disappear, prompting two main conclusions: (i) The claimed improvements to false positive rate and replication rate in Benjamin et al. (2017) are exaggerated and misleading. (ii) There are plausible scenarios under which the lower cutoff will make the replication crisis worse.
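The arithmetic behind the abstract's argument can be illustrated with a small sketch. It is not the model used in the paper. It assumes the standard expression for the share of significant results that are false positives, prior odds of 1:10 that a tested effect is real, power held fixed at 0.8 for both thresholds, and a deliberately crude picture of P-hacking in which a researcher makes several independent attempts at a true null and reports any that crosses the cutoff. The function names and all numeric settings are hypothetical choices made for illustration.

```python
def false_positive_rate(alpha, power, prior_h1):
    """Share of 'significant' results that come from true nulls.

    alpha    : probability a true null yields a significant result
    power    : probability a real effect yields a significant result
    prior_h1 : prior probability that a tested effect is real
    """
    false_pos = alpha * (1.0 - prior_h1)
    true_pos = power * prior_h1
    return false_pos / (false_pos + true_pos)


def hacked_alpha(alpha, attempts):
    """Effective type I error when any of `attempts` independent tries may be reported.

    This is an illustrative stand-in for P-hacking, not the paper's model.
    """
    return 1.0 - (1.0 - alpha) ** attempts


# Assumed settings (hypothetical): prior odds 1:10 for H1, power fixed at 0.8.
PRIOR_H1 = 1.0 / 11.0
POWER = 0.8

if __name__ == "__main__":
    for alpha in (0.05, 0.005):
        for attempts in (1, 5, 20):
            eff = hacked_alpha(alpha, attempts)
            fpr = false_positive_rate(eff, POWER, PRIOR_H1)
            print(f"cutoff={alpha:<6} attempts={attempts:<3} "
                  f"effective alpha={eff:.4f}  false positive rate={fpr:.3f}")
```

Under these assumptions, the improvement from the lower cutoff depends entirely on how the effective type I error behaves. With twenty attempts, the effective rate at the 0.005 cutoff is roughly 0.095, which is above the nominal 0.05 the proposal sets out to replace. The sketch ignores the fact that power would typically fall at the stricter cutoff for a fixed sample size, and that P-hacking may also change the rate of true positives.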
