Mendelian randomization with fine-mapped genetic data: choosing from large numbers of correlated instrumental variables
Stephen Burgess, Verena Zuber, Elsa Valdes-Marquez, Benjamin B Sun, Jemma C Hopewell

TL;DR
This paper introduces a robust Mendelian randomization method using principal components analysis on summarized genetic data, effectively handling many correlated variants and reducing false positives in causal inference.
Contribution
It proposes a novel PCA-based approach for Mendelian randomization that accounts for all genetic variants without instability or inflated error rates.
Findings
The method is robust to the choice of variants and to variation in the genetic correlation matrix.
The approach reduces Type 1 error rates compared with analyses that use many correlated variants directly.
An application to testosterone variants illustrates more consistent inference.
Abstract
Mendelian randomization uses genetic variants to make causal inferences about the effect of a risk factor on an outcome. With fine-mapped genetic data, there may be hundreds of genetic variants in a single gene region, any of which could be used to assess this causal relationship. However, using too many genetic variants in the analysis can lead to spurious estimates and inflated Type 1 error rates. But if only a few genetic variants are used, then the majority of the data is ignored and estimates are highly sensitive to the particular choice of variants. We propose an approach based on summarized data only (genetic association and correlation estimates) that uses principal components analysis to form instruments. This approach has desirable theoretical properties: it takes the totality of data into account and does not suffer from numerical instabilities. It also has good properties in…
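The idea of forming instruments from principal components of summarized data can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state how the matrix being decomposed is weighted, so the weighting by |beta_x| / se_y, the 99% variance threshold, and the function name `pca_ivw` are all assumptions made here. The inputs are per-variant associations with the risk factor (`beta_x`), associations with the outcome and their standard errors (`beta_y`, `se_y`), and the variant correlation matrix (`rho`).

```python
import numpy as np

def pca_ivw(beta_x, beta_y, se_y, rho, var_explained=0.99):
    """Hypothetical sketch: PCA-based inverse-variance weighted estimate.

    The weighting of the decomposed matrix and the variance threshold
    are assumptions for illustration, not taken from the abstract.
    """
    beta_x = np.asarray(beta_x, dtype=float)
    beta_y = np.asarray(beta_y, dtype=float)
    se_y = np.asarray(se_y, dtype=float)
    rho = np.asarray(rho, dtype=float)

    # Assumed weighting: scale the correlation matrix by |beta_x| / se_y.
    w = np.abs(beta_x) / se_y
    psi = np.outer(w, w) * rho

    # Eigendecomposition of the symmetric matrix, largest eigenvalues first.
    eigvals, eigvecs = np.linalg.eigh(psi)
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]

    # Keep the fewest components reaching the requested share of variance.
    frac = np.cumsum(eigvals) / eigvals.sum()
    k = min(int(np.searchsorted(frac, var_explained)) + 1, len(eigvals))
    loadings = eigvecs[:, :k]

    # Project the associations and their covariance onto the components.
    bx = loadings.T @ beta_x
    by = loadings.T @ beta_y
    omega = np.outer(se_y, se_y) * rho
    omega_pc = loadings.T @ omega @ loadings

    # Generalized least squares estimate using the transformed instruments.
    omega_inv_bx = np.linalg.solve(omega_pc, bx)
    denom = bx @ omega_inv_bx
    estimate = (omega_inv_bx @ beta_y_proj(by)) / denom if False else (omega_inv_bx @ by) / denom
    std_error = np.sqrt(1.0 / denom)
    return estimate, std_error, k
```

Because the components are orthogonal combinations of all variants, every variant contributes to the estimate, while the small matrix `omega_pc` is inverted in place of the full, near-singular correlation matrix.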
