Selecting invalid instruments to improve Mendelian randomization with two-sample summary data
Ashish Patel, Francis J. DiTraglia, Verena Zuber, and Stephen Burgess

TL;DR
This paper introduces a method for selecting genetic instruments in Mendelian randomization that balances bias and variance, allowing the inclusion of potentially invalid instruments to improve causal effect estimation.
Contribution
The paper proposes a focused instrument selection strategy that chooses genetic variants to minimize the estimated asymptotic mean squared error of the causal effect estimate, and it constructs confidence intervals that remain robust to invalid instruments.
Findings
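Including additional, slightly invalid instruments can improve estimation accuracy in terms of mean squared error: a small bias may be outweighed by a larger reduction in variance.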
The method effectively balances bias and variance in instrument selection.
Empirical applications validate the approach in lipid and vitamin D studies.
Abstract
Mendelian randomization (MR) is a widely-used method to estimate the causal relationship between a risk factor and disease. A fundamental part of any MR analysis is to choose appropriate genetic variants as instrumental variables. Genome-wide association studies often reveal that hundreds of genetic variants may be robustly associated with a risk factor, but in some situations investigators may have greater confidence in the instrument validity of only a smaller subset of variants. Nevertheless, the use of additional instruments may be optimal from the perspective of mean squared error even if they are slightly invalid; a small bias in estimation may be a price worth paying for a larger reduction in variance. For this purpose, we consider a method for "focused" instrument selection whereby genetic variants are selected to minimise the estimated asymptotic mean squared error of causal…
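The bias-variance trade-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' selection procedure. It assumes an inverse-variance weighted (IVW) estimator, treats the variant-exposure associations as known, and uses made-up direct effects (`alpha`) that would have to be estimated in practice. The function names and all numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def ivw_bias_variance(
    beta_x: Sequence[float],
    se_y: Sequence[float],
    alpha: Sequence[float],
) -> Tuple[float, float]:
    """First-order bias and variance of an IVW causal estimate.

    beta_x: variant-exposure associations (assumed known here)
    se_y:   standard errors of the variant-outcome associations
    alpha:  hypothetical direct effects on the outcome; 0 means a valid instrument
    """
    # Total precision contributed by the instruments in the set.
    precision = sum(bx ** 2 / s ** 2 for bx, s in zip(beta_x, se_y))
    # Direct effects shift the weighted ratio estimate away from the truth.
    bias = sum(bx * a / s ** 2 for bx, a, s in zip(beta_x, alpha, se_y)) / precision
    variance = 1.0 / precision
    return bias, variance


def mse(beta_x: Sequence[float], se_y: Sequence[float], alpha: Sequence[float]) -> float:
    """Approximate mean squared error: squared bias plus variance."""
    bias, variance = ivw_bias_variance(beta_x, se_y, alpha)
    return bias ** 2 + variance


# Toy data (illustrative only): 3 valid core variants.
CORE: Tuple[List[float], List[float], List[float]] = ([0.1] * 3, [0.02] * 3, [0.0] * 3)


def full_set(extra_alpha: float) -> Tuple[List[float], List[float], List[float]]:
    """Core variants plus 7 extra variants sharing the same direct effect."""
    return ([0.1] * 10, [0.02] * 10, [0.0] * 3 + [extra_alpha] * 7)


if __name__ == "__main__":
    print("core only:            ", mse(*CORE))
    print("all, slightly invalid:", mse(*full_set(0.002)))
    print("all, strongly invalid:", mse(*full_set(0.02)))
```

With these toy numbers, adding the seven slightly invalid variants lowers the approximate mean squared error relative to the core set, while adding strongly invalid variants raises it. That is the trade-off a focused selection rule would have to weigh.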
Taxonomy
Topics: Genetic Associations and Epidemiology · Advanced Causal Inference Techniques · Liver Disease Diagnosis and Treatment
