Statistical Methods for cis-Mendelian Randomization with Two-sample Summary-level Data
Apostolos Gkatzionis, Stephen Burgess, Paul J. Newcombe

TL;DR
This paper reviews statistical methods for cis-Mendelian randomization using summary-level data, focusing on variable selection and estimation techniques to improve causal inference with correlated genetic variants.
Contribution
It compares various methods for variable selection and estimation in cis-Mendelian randomization, highlighting the advantages of factor analysis and Bayesian approaches under weak instrument conditions.
Findings
Methods perform similarly with large samples and strong instruments.
Factor analysis and Bayesian methods are more reliable with weak instruments.
Pruning approaches are less stable in weak instrument scenarios.
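The pruning approach named in the findings can be illustrated with a generic greedy sketch. It is not necessarily the exact procedure evaluated in the paper. The function name `greedy_prune`, the use of z-scores for ranking, and the default `r2_threshold` of 0.1 are illustrative assumptions.

```python
import numpy as np

def greedy_prune(z_scores, rho, r2_threshold=0.1):
    """Greedy correlation-based pruning (hypothetical helper).

    z_scores     : variant-exposure association z-statistics
    rho          : genetic correlation (LD) matrix between variants
    r2_threshold : assumed cut-off on squared correlation
    """
    z_scores = np.asarray(z_scores, dtype=float)
    rho = np.asarray(rho, dtype=float)
    # Visit variants from strongest to weakest association with the exposure.
    order = np.argsort(-np.abs(z_scores))
    kept = []
    for j in order:
        # Keep a variant only if it is weakly correlated with every kept one.
        if all(rho[j, k] ** 2 < r2_threshold for k in kept):
            kept.append(int(j))
    return sorted(kept)
```

Because the ranking depends on estimated z-scores, small changes in weak associations can change which variants survive, which is one plausible reason for the instability noted above.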
Abstract
Mendelian randomization is the use of genetic variants to assess the existence of a causal relationship between a risk factor and an outcome of interest. Here, we focus on two-sample summary-data Mendelian randomization analyses with many correlated variants from a single gene region, and particularly on cis-Mendelian randomization studies which use protein expression as a risk factor. Such studies must rely on a small, curated set of variants from the studied region; using all variants in the region requires inverting an ill-conditioned genetic correlation matrix and results in numerically unstable causal effect estimates. We review methods for variable selection and estimation in cis-Mendelian randomization with summary-level data, ranging from stepwise pruning and conditional analysis to principal components analysis, factor analysis and Bayesian variable selection. In a simulation…
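The numerical issue described in the abstract can be sketched with the standard inverse-variance weighted estimator for correlated variants, which requires solving a linear system in the genetic correlation matrix. This is a minimal illustration, not the paper's code. The names `correlated_ivw` and `toy_correlation`, and the autoregressive correlation structure, are assumptions made for the example.

```python
import numpy as np

def correlated_ivw(beta_x, beta_y, se_y, rho):
    """IVW causal estimate allowing for correlated variants (sketch).

    beta_x, beta_y : variant-exposure and variant-outcome estimates
    se_y           : standard errors of beta_y
    rho            : genetic correlation (LD) matrix
    """
    beta_x = np.asarray(beta_x, dtype=float)
    beta_y = np.asarray(beta_y, dtype=float)
    se_y = np.asarray(se_y, dtype=float)
    # Covariance of the outcome associations: Omega_ij = se_i * se_j * rho_ij.
    omega = np.outer(se_y, se_y) * np.asarray(rho, dtype=float)
    # Solve Omega w = beta_x instead of forming the inverse explicitly.
    omega_inv_bx = np.linalg.solve(omega, beta_x)
    denom = beta_x @ omega_inv_bx
    estimate = (omega_inv_bx @ beta_y) / denom
    std_error = np.sqrt(1.0 / denom)
    return estimate, std_error

def toy_correlation(n_variants, r):
    """Assumed toy LD structure: correlation r**|i - j| between variants."""
    idx = np.arange(n_variants)
    return r ** np.abs(idx[:, None] - idx[None, :])

# The condition number grows sharply as neighbouring variants become
# almost perfectly correlated, which is what makes the solve unstable.
cond_moderate = np.linalg.cond(toy_correlation(20, 0.5))
cond_high = np.linalg.cond(toy_correlation(20, 0.999))
```

Comparing `cond_moderate` with `cond_high` shows why a region with many highly correlated variants is problematic: small errors in the correlation estimates are amplified when the system is solved.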
Taxonomy
Methods: Pruning
