Choosing alpha post hoc: the danger of multiple standard significance thresholds
Jesse Hemerik, Nick W Koning

TL;DR
The paper highlights the risks of selecting significance thresholds post hoc, especially when multiple standards coexist, which can invalidate hypothesis testing results.
Contribution
It demonstrates how multiple significance thresholds within a field can lead to biased alpha choices, undermining the validity of hypothesis tests.
Findings
Multiple thresholds can cause biased alpha selection
Post hoc alpha choice invalidates hypothesis testing
Potential solutions to mitigate this issue
Abstract
A fundamental assumption of classical hypothesis testing is that the significance threshold α is chosen independently from the data. The validity of confidence intervals likewise relies on choosing α beforehand. We point out that the independence of α is guaranteed in practice because, in most fields, there exists one standard α that everyone uses -- so that α is automatically independent of everything. However, there have been recent calls to decrease α from 0.05 to 0.005. We note that this may lead to multiple accepted standard thresholds within one scientific field. For example, different journals may require different significance thresholds. As a consequence, some researchers may be tempted to conveniently choose their α based on their p-value. We use examples to illustrate that this severely invalidates hypothesis tests, and…
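The effect described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper. It assumes a true null hypothesis (so p-values are uniform on [0, 1)) and a hypothetical researcher who, after seeing the p-value, reports the strictest of two accepted thresholds that the p-value still passes. The function names `post_hoc_alpha` and `simulate` and the threshold pair (0.005, 0.05) are illustrative assumptions.

```python
import random


def post_hoc_alpha(p, thresholds=(0.005, 0.05)):
    """Hypothetical post hoc rule: report the smallest accepted threshold
    that the p-value passes; if it passes none, report the largest."""
    for alpha in sorted(thresholds):
        if p <= alpha:
            return alpha
    return max(thresholds)


def simulate(n_studies=200_000, thresholds=(0.005, 0.05), seed=0):
    """Simulate studies under a true null hypothesis.

    Returns (overall false rejection rate,
             fraction of studies rejected while reporting the strictest alpha).
    """
    rng = random.Random(seed)
    strictest = min(thresholds)
    rejections = 0
    rejections_at_strictest = 0
    for _ in range(n_studies):
        p = rng.random()  # assumption: p is uniform when the null is true
        alpha = post_hoc_alpha(p, thresholds)  # alpha now depends on the data
        if p <= alpha:
            rejections += 1
            if alpha == strictest:
                rejections_at_strictest += 1
    return rejections / n_studies, rejections_at_strictest / n_studies


if __name__ == "__main__":
    overall, strict = simulate()
    print(f"overall false rejection rate: {overall:.4f}")
    print(f"rejections reported at the strictest alpha: {strict:.4f}")
```

Under these assumptions the overall false rejection rate sits near the largest threshold (about 0.05), whichever α ends up in the report. A result reported "at α = 0.005" therefore comes from a procedure whose actual type I error rate is roughly ten times that, because the same researcher would have rejected at 0.05 had the p-value been larger.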
Taxonomy
Topics: Delphi Technique in Research
