Cross-screening in observational studies that test many hypotheses
Qingyuan Zhao, Dylan S. Small, Paul R. Rosenbaum

TL;DR
This paper introduces "cross-screening", a method for testing many causal hypotheses in observational studies. It splits the data in half at random and uses one half to plan the analysis carried out on the other, with the aim of controlling error rates and improving replicability.
Contribution
It proposes cross-screening, a new strategy intended to improve validity and replicability when an observational study tests many causal hypotheses.
Findings
Cross-screening controls the family-wise error rate.
The method is particularly useful for studies with hundreds or thousands of hypotheses.
The method is applied to a study of fish consumption and biomarkers, demonstrating its practical utility.
Abstract
We discuss observational studies that test many causal hypotheses, either hypotheses about many outcomes or many treatments. To be credible an observational study that tests many causal hypotheses must demonstrate that its conclusions are neither artifacts of multiple testing nor of small biases from nonrandom treatment assignment. In a sense that needs to be defined carefully, hidden within a sensitivity analysis for nonrandom assignment is an enormous correction for multiple testing: in the absence of bias, it is extremely improbable that multiple testing alone would create an association insensitive to moderate biases. We propose a new strategy called "cross-screening", different from but motivated by recent work of Bogomolov and Heller on replicability. Cross-screening splits the data in half at random, uses the first half to plan a study carried out on the second half, then uses…
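The split-and-plan idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract above is truncated, so several pieces are assumptions made here for illustration: the second step mirrors the first (the second half plans a study run on the first half), each direction is tested at level alpha/2 so that a Bonferroni argument covers the two studies, and hypotheses are screened with a fixed p-value threshold. The sketch also replaces the sensitivity analysis for nonrandom assignment with a plain one-sided z-test on matched-pair differences, which allows for no hidden bias. The names `one_sided_p`, `screen_then_test`, `cross_screen`, and `screen_level` are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def one_sided_p(diffs):
    """Upper-tail p-value for H0: mean pair difference <= 0 (normal approx.)."""
    se = stdev(diffs) / math.sqrt(len(diffs))
    if se == 0:
        return 0.0 if mean(diffs) > 0 else 1.0
    return 1.0 - NormalDist().cdf(mean(diffs) / se)


def screen_then_test(plan_half, test_half, n_hyp, level, screen_level):
    """Pick hypotheses on plan_half, then test only those on test_half."""
    selected = [
        k for k in range(n_hyp)
        if one_sided_p([row[k] for row in plan_half]) <= screen_level
    ]
    if not selected:
        return set()
    # Assumption: Bonferroni over the hypotheses that survived screening.
    cutoff = level / len(selected)
    return {
        k for k in selected
        if one_sided_p([row[k] for row in test_half]) <= cutoff
    }


def cross_screen(pair_diffs, alpha=0.05, screen_level=0.05, seed=0):
    """Return indices of hypotheses rejected in either direction."""
    rows = list(pair_diffs)
    random.Random(seed).shuffle(rows)      # random split into two halves
    mid = len(rows) // 2
    first, second = rows[:mid], rows[mid:]
    n_hyp = len(rows[0])
    # Assumption: each direction is run at alpha / 2.
    a = screen_then_test(first, second, n_hyp, alpha / 2, screen_level)
    b = screen_then_test(second, first, n_hyp, alpha / 2, screen_level)
    return a | b


# Simulated data (not from the paper): 200 matched pairs, 20 outcomes,
# and only outcome 0 has a real effect.
rng = random.Random(1)
data = [
    [rng.gauss(1.0 if k == 0 else 0.0, 1.0) for k in range(20)]
    for _ in range(200)
]
rejected = cross_screen(data)
print(sorted(rejected))
```

In this simulation the strongly affected outcome (index 0) is selected and rejected in both directions. Testing only the screened hypotheses in each half means the multiplicity correction is paid over a handful of hypotheses rather than over all 20.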
Taxonomy
Topics: Advanced Causal Inference Techniques · Statistical Methods in Clinical Trials · Statistical Methods and Bayesian Inference
