p-Hacking Inflates Type I Error Rates in the Error Statistical Approach but not in the Formal Inference Approach
Mark Rubin

TL;DR
This paper compares two philosophies of significance testing and argues that p-hacking inflates Type I error rates in the error statistical approach but not in the formal inference approach. It also discusses the implications for research practice.
Contribution
It clarifies how p-hacking affects Type I error rates differently depending on the significance testing philosophy used.
Findings
In the error statistical approach, p-hacking inflates Type I error rates.
In the formal inference approach, p-hacking does not inflate Type I error rates.
Implications for research practices and p-hacking reduction are discussed.
Abstract
p-hacking occurs when researchers conduct multiple significance tests (e.g., p_1; H_{0,1} and p_2; H_{0,2}) and then selectively report tests that yield desirable (usually significant) results (e.g., p_2 < 0.05; H_{0,2}) without correcting for multiple testing (e.g., 0.05/2 = 0.025). In the present article, I consider p-hacking in the context of two philosophies of significance testing: the error statistical approach and the formal inference approach. I argue that although p-hacking inflates Type I error rates in the error statistical approach, it does not inflate them in the formal inference approach. Specifically, in the error statistical approach, the "actual" familywise error rate (e.g., 1 - [1 - 0.05]^2 = 0.098 for two independent tests) is relevant because it covers both the reported and unreported tests in the "actual" test procedure (i.e., p_1; H_{0,1} and p_2; H_{0,2}). In this approach, Type I error…
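The two numeric examples in the abstract (the uncorrected familywise error rate for two independent tests and the Bonferroni-style threshold 0.05/2) can be reproduced with a short sketch. The function names `familywise_error_rate` and `bonferroni_alpha` are illustrative choices, not taken from the paper, and the formula assumes the tests are independent, as the abstract's example does.

```python
def familywise_error_rate(alpha: float, k: int) -> float:
    """Probability of at least one Type I error across k independent
    tests, each conducted at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** k


def bonferroni_alpha(alpha: float, k: int) -> float:
    """Per-test threshold that keeps the familywise rate at or below
    alpha when k tests are conducted (hypothetical helper name)."""
    return alpha / k


if __name__ == "__main__":
    alpha, k = 0.05, 2  # values used in the abstract's example

    # Uncorrected: both tests at 0.05; the abstract rounds this to 0.098.
    print(f"uncorrected familywise rate: {familywise_error_rate(alpha, k):.4f}")

    # Corrected: each test at 0.05 / 2 = 0.025.
    per_test = bonferroni_alpha(alpha, k)
    print(f"corrected per-test alpha:    {per_test:.4f}")
    print(f"corrected familywise rate:   {familywise_error_rate(per_test, k):.4f}")
```

With the corrected per-test threshold, the familywise rate across both tests falls just below the nominal 0.05, which is the purpose of the correction that p-hacking omits.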
Taxonomy
Topics: Meta-analysis and systematic reviews · Statistical Methods in Clinical Trials · Advanced Causal Inference Techniques
