Choosing alpha post hoc: the danger of multiple standard significance thresholds

Jesse Hemerik; Nick W Koning

arXiv:2410.02306 · stat.AP · March 11, 2025

Open Access

TL;DR

The paper highlights the risks of selecting significance thresholds post hoc, especially when multiple standards coexist, which can invalidate hypothesis testing results.

Contribution

It demonstrates how multiple significance thresholds within a field can lead to biased alpha choices, undermining the validity of hypothesis tests.

Findings

01

Multiple thresholds can cause biased alpha selection

02

Post hoc alpha choice invalidates hypothesis testing

03

The paper discusses potential solutions to mitigate this issue

Abstract

A fundamental assumption of classical hypothesis testing is that the significance threshold $α$ is chosen independently from the data. The validity of confidence intervals likewise relies on choosing $α$ beforehand. We point out that the independence of $α$ is guaranteed in practice because, in most fields, there exists one standard $α$ that everyone uses, so that $α$ is automatically independent of everything. However, there have been recent calls to decrease $α$ from $0.05$ to $0.005$. We note that this may lead to multiple accepted standard thresholds within one scientific field. For example, different journals may require different significance thresholds. As a consequence, some researchers may be tempted to conveniently choose their $α$ based on their p-value. We use examples to illustrate that this severely invalidates hypothesis tests, and…
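The effect described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper; it assumes a true null hypothesis with an exactly uniform p-value, and a hypothetical researcher (the `post_hoc_alpha` function, a name introduced here for illustration) who reports the smallest of the two standard thresholds that the p-value happens to pass. It then estimates how often the null is falsely rejected among studies that report each threshold.

```python
import random

# The two thresholds named in the abstract.
STANDARD_ALPHAS = (0.005, 0.05)


def post_hoc_alpha(p, alphas=STANDARD_ALPHAS):
    """Hypothetical researcher behaviour (an assumption, not the paper's model):
    report the smallest standard alpha that p passes, else the largest one."""
    for a in sorted(alphas):
        if p <= a:
            return a
    return max(alphas)


def simulate(n_studies=200_000, seed=0):
    """Estimate the false rejection rate among studies reporting each alpha."""
    rng = random.Random(seed)
    counts = {a: [0, 0] for a in STANDARD_ALPHAS}  # alpha -> [times chosen, rejections]
    for _ in range(n_studies):
        p = rng.random()  # assumption: under a true null, p is Uniform(0, 1)
        a = post_hoc_alpha(p)
        counts[a][0] += 1
        counts[a][1] += p <= a
    return {a: rej / chosen for a, (chosen, rej) in counts.items() if chosen}


rates = simulate()
for alpha, rate in sorted(rates.items()):
    print(f"reported alpha = {alpha}: false rejection rate = {rate:.4f}")
```

Under these assumptions, every study that reports $α = 0.005$ is a rejection, because that threshold is only chosen after seeing $p \le 0.005$. The false rejection rate among those studies is therefore 1 rather than 0.005. Among studies that report $α = 0.05$, the rate is $0.045 / 0.995 \approx 0.045$. The overall chance of rejecting a true null is still 5%, so a result labelled "significant at 0.005" does not carry the error guarantee that the label suggests.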

Taxonomy

Topics: Delphi Technique in Research