Assessing the replicability of RCTs in RWE emulations
Jeanette Köppe, Charlotte Micheloud, Stella Erdmann, Rachel Heyard, Leonhard Held

TL;DR
This paper evaluates the sceptical p-value as a new statistical method for assessing the replicability of RCTs in real-world evidence studies, comparing it to the traditional two-trials rule and demonstrating its advantages.
Contribution
It introduces the sceptical p-value for replicability assessment, extends its methodology to non-inferiority trials, and compares its performance with the two-trials rule using real-world emulated RCTs.
Findings
The sceptical p-value depends not only on the two p-values but also on sample size and effect size.
It has larger power to detect true effects than the two-trials rule.
In real-world data, it shows more evidence for treatment effects.
Abstract
Background: The standard regulatory approach to assess replication success is the two-trials rule, requiring both the original and the replication study to be significant with effect estimates in the same direction. The sceptical p-value was recently presented as an alternative method for the statistical assessment of the replicability of study results. Methods: We review the statistical properties of the sceptical p-value and compare those to the two-trials rule. We extend the methodology to non-inferiority trials and describe how to invert the sceptical p-value to obtain confidence intervals. We illustrate the performance of the different methods using real-world evidence emulations of randomized, controlled trials (RCTs) conducted within the RCT DUPLICATE initiative. Results: The sceptical p-value depends not only on the two p-values, but also on sample size and effect size of the…
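The two-trials rule described in the Background can be illustrated with a short sketch. This is not code from the paper: the function names, the normal approximation for the test statistic, and the one-sided significance level of 0.025 are assumptions chosen for illustration. The sceptical p-value itself is not implemented here.

```python
import math


def one_sided_p(estimate, se):
    """One-sided p-value for H0: effect <= 0, assuming a normal approximation."""
    z = estimate / se
    # Upper-tail probability of the standard normal distribution at z.
    return 0.5 * math.erfc(z / math.sqrt(2))


def two_trials_rule(est_o, se_o, est_r, se_r, alpha=0.025):
    """Return True if the original and the replication study are both
    significant with effect estimates in the same direction.

    alpha is a one-sided level; 0.025 is an assumed default.
    """
    # Orient both estimates by the sign of the original estimate, so a
    # replication effect in the opposite direction cannot count as success.
    sign = 1.0 if est_o >= 0 else -1.0
    p_o = one_sided_p(sign * est_o, se_o)
    p_r = one_sided_p(sign * est_r, se_r)
    return max(p_o, p_r) <= alpha


# Hypothetical effect estimates and standard errors, for illustration only.
print(two_trials_rule(0.5, 0.2, 0.4, 0.15))   # both significant, same direction
print(two_trials_rule(0.5, 0.2, -0.4, 0.15))  # replication in opposite direction
```

With these made-up inputs the first call prints True and the second prints False. Note that the rule uses only the two p-values and the direction of the estimates, which is the contrast the abstract draws with the sceptical p-value.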
Taxonomy
Topics: Nuclear and radioactivity studies
