A simple strategy for valid inference in target trial emulations
Mats Julius Stensrud

TL;DR
This paper introduces a sample splitting method for target trial emulation that allows data-informed protocol development while maintaining valid statistical inference guarantees.
Contribution
It proposes a simple, practical procedure based on sample splitting to address concerns about bias and invalid inference in iterative target trial protocol development.
Findings
Inference on the second split retains the usual coverage guarantees even though the protocol was chosen after examining data.
It mirrors the process of moving from pilot studies to definitive trials.
The approach is straightforward and applicable in observational study settings.
Abstract
Target trial emulation has improved comparative effectiveness research by making the causal question, assumptions, and analysis plan explicit. However, target trial protocols are usually developed iteratively. After examining the data, investigators revise the protocol to reflect which target trials the observational data can realistically support. While this iterative procedure is part of normal scientific practice, it raises concerns about selective choices and invalid statistical inference. A simple procedure can address these concerns. This procedure is based on sample splitting. In the initial split, investigators explore the data to define a target trial protocol. When these choices are made, the target trial protocol is implemented on the second split. Although the investigators made data-informed choices to select the target trial protocol, the inference has the usual coverage…
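The two-split procedure described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy data generator, the function names (`make_toy_data`, `split_sample`, `choose_protocol`, `estimate_effect`), the single protocol choice (a minimum-age eligibility cutoff), and the unadjusted difference-in-means estimator with a Wald interval. A real emulation would also adjust for confounding; the toy data assign treatment at random so that this step can be left out.

```python
import math
import random
import statistics


def make_toy_data(n=1000, seed=0):
    """Hypothetical toy cohort: treatment is assigned at random (no confounding)."""
    rng = random.Random(seed)
    records = []
    for i in range(n):
        treated = rng.random() < 0.5
        records.append({
            "id": i,
            "age": rng.randint(40, 80),
            "treated": treated,
            "outcome": (1.0 if treated else 0.0) + rng.gauss(0.0, 1.0),
        })
    return records


def split_sample(records, explore_fraction=0.5, seed=0):
    """Randomly partition the records into an exploration split and an inference split."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * explore_fraction)
    return shuffled[:cut], shuffled[cut:]


def choose_protocol(explore, candidate_min_ages, min_per_arm=30):
    """Data-informed protocol choice, made on the exploration split only.

    Picks the least restrictive age cutoff that leaves enough people per arm.
    """
    for cutoff in sorted(candidate_min_ages):
        eligible = [r for r in explore if r["age"] >= cutoff]
        n_treated = sum(r["treated"] for r in eligible)
        n_untreated = len(eligible) - n_treated
        if min(n_treated, n_untreated) >= min_per_arm:
            return {"min_age": cutoff}
    raise ValueError("no candidate cutoff leaves enough people in each arm")


def estimate_effect(inference, protocol, z=1.96):
    """Apply the fixed protocol to the inference split; return estimate and Wald 95% CI."""
    eligible = [r for r in inference if r["age"] >= protocol["min_age"]]
    y1 = [r["outcome"] for r in eligible if r["treated"]]
    y0 = [r["outcome"] for r in eligible if not r["treated"]]
    diff = statistics.mean(y1) - statistics.mean(y0)
    se = math.sqrt(statistics.variance(y1) / len(y1)
                   + statistics.variance(y0) / len(y0))
    return diff, (diff - z * se, diff + z * se)


# Explore on the first split, then run the chosen protocol once on the second split.
cohort = make_toy_data()
explore_split, inference_split = split_sample(cohort)
chosen = choose_protocol(explore_split, candidate_min_ages=[40, 50, 60])
estimate, interval = estimate_effect(inference_split, chosen)
print(chosen, estimate, interval)
```

The key design point in this sketch is that `choose_protocol` never sees the inference split, and `estimate_effect` is run only once the protocol is fixed, so the interval is computed on data that played no part in selecting the protocol.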
