An investigation of the false discovery rate and the misinterpretation of P values
David Colquhoun

TL;DR
This paper critically examines the misuse of P values in scientific research, highlighting that common thresholds like 0.05 lead to high false discovery rates and recommending stricter criteria to improve reliability.
Contribution
It provides a rigorous analysis of false discovery rates associated with P value thresholds and advocates for more stringent standards to reduce false positives.
Findings
Using P = 0.05 as the criterion for claiming a discovery gives a false discovery rate of at least 30%
In under-powered experiments, most claimed discoveries are false
Keeping the false discovery rate below 5% requires a 3-sigma rule or a P value below 0.001
Abstract
The following proposition is justified from several different points of view. If you use P = 0.05 to suggest that you have made a discovery, you will be wrong at least 30 percent of the time. If, as is often the case, experiments are under-powered, you will be wrong most of the time. It is concluded that if you wish to keep your false discovery rate below 5 percent, you need to use a 3-sigma rule, or to insist on P value below 0.001. And never use the word "significant".
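The arithmetic behind a false discovery rate can be sketched in a few lines. The example below is a minimal illustration, not the paper's own code: the function name `false_discovery_rate`, the assumed prevalence of real effects (10%), and the power values (0.8 for a well-powered test, 0.2 for an under-powered one) are all assumptions chosen for illustration. It treats every test with P below the threshold as a claimed discovery and asks what fraction of those claims are false.

```python
def false_discovery_rate(alpha: float, power: float, prevalence: float) -> float:
    """Fraction of claimed discoveries (tests with P < alpha) that are false.

    alpha      -- P value threshold used to claim a discovery
    power      -- probability of detecting an effect that is really there
    prevalence -- assumed fraction of tested hypotheses with a real effect
    """
    # Tests with no real effect that still fall below the threshold.
    false_positives = alpha * (1.0 - prevalence)
    # Tests with a real effect that are correctly detected.
    true_positives = power * prevalence
    return false_positives / (false_positives + true_positives)


if __name__ == "__main__":
    prevalence = 0.1  # assumption: 10% of tested hypotheses are real effects
    scenarios = [
        (0.05, 0.8),   # conventional threshold, well-powered (assumed power)
        (0.05, 0.2),   # conventional threshold, under-powered (assumed power)
        (0.001, 0.8),  # stricter threshold, well-powered (assumed power)
    ]
    for alpha, power in scenarios:
        fdr = false_discovery_rate(alpha, power, prevalence)
        print(f"alpha={alpha}, power={power}: FDR = {fdr:.1%}")
```

Under these assumed inputs, the P = 0.05 threshold gives a false discovery rate of 36% with power 0.8 and about 69% with power 0.2, while P = 0.001 with power 0.8 gives about 1.1%. The exact figures depend on the assumed prevalence, which is rarely known in practice.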
