TL;DR
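This paper recalibrates the sceptical p-value, a quantitative measure of replication success, so that it controls the overall Type-I error rate exactly when the effect is null in both studies. Compared with the two-trials rule, it offers larger project power and needs a smaller replication sample size when the original study is already convincing.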
Contribution
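It proposes a recalibration of the sceptical p-value that avoids the double dichotomization for significance of the two-trials rule, provides exact overall Type-I error control with additional bounds on the partial and conditional Type-I error rates, and supports power calculations for replication studies.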
Findings
Exact overall Type-I error control when the effect is null in both studies
Larger project power than the two-trials rule
Smaller replication sample size than the two-trials rule when the original study is already convincing
Abstract
We study a statistical framework for replicability based on a recently proposed quantitative measure of replication success, the sceptical p-value. A recalibration is proposed to obtain exact overall Type-I error control if the effect is null in both studies and additional bounds on the partial and conditional Type-I error rate, which represent the case where only one study has a null effect. The approach avoids the double dichotomization for significance of the two-trials rule and has larger project power to detect existing effects over both studies in combination. It can also be used for power calculations and requires a smaller replication sample size than the two-trials rule for already convincing original studies. We illustrate the performance of the proposed methodology in an application to data from the Experimental Economics Replication Project.
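For orientation, the two-trials rule that the abstract uses as its baseline can be sketched in a few lines. This is only a minimal illustration under stated assumptions: a one-sided significance level of 0.025 in each study (a common convention, not stated in the abstract), independent original and replication studies, and hypothetical function names. It does not implement the paper's recalibrated sceptical p-value.

```python
from statistics import NormalDist


def two_trials_success(z_o: float, z_r: float, alpha: float = 0.025) -> bool:
    """Two-trials rule: both z-values must exceed the one-sided critical value.

    z_o and z_r are the z-statistics of the original and replication study.
    alpha = 0.025 is an assumed conventional level, not taken from the abstract.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # Each study is dichotomized separately: this is the "double dichotomization".
    return z_o > z_crit and z_r > z_crit


def two_trials_overall_type1(alpha: float = 0.025) -> float:
    """Overall Type-I error rate when the effect is null in both studies.

    Assumes independent studies, so the two rejection probabilities multiply.
    """
    return alpha * alpha


def two_trials_project_power(power_o: float, power_r: float) -> float:
    """Probability that both independent studies reach significance."""
    return power_o * power_r


if __name__ == "__main__":
    print(two_trials_success(2.5, 2.5))
    print(two_trials_success(4.0, 1.5))
    print(two_trials_overall_type1())
    print(two_trials_project_power(0.9, 0.9))
```

In the second call the original study is very convincing (z = 4.0), yet the rule still fails because the replication falls just short of its own threshold. That rigidity is what the abstract refers to when it says the approach avoids the double dichotomization of the two-trials rule.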
