FDR controlling procedures with dimension reduction and their application to GWAS with linkage disequilibrium score
Dayeon Jung, Yewon Kim, Junyong Park

TL;DR
This paper combines dimension reduction, specifically PCA, with LD scores used as covariates to improve FDR control and power in GWAS, offering practical, interpretable methods motivated by the missing heritability problem.
Contribution
It proposes novel FDR control methods using PCA and LD scores in GWAS, enhancing power and interpretability over existing approaches.
Findings
PCA-based methods improve FDR control in high-dimensional GWAS.
Incorporating LD scores as covariates increases statistical power.
The methods are shown to be effective on real GWAS data for body mass index (BMI).
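The general idea in the findings above (reduce many covariates with PCA, then let the reduced covariate inform an FDR procedure) can be sketched as follows. This is only an illustration under stated assumptions: the summary does not spell out the authors' procedure, so covariate-weighted Benjamini-Hochberg is used here as a stand-in, and the simulated data, the function names, and the exponential weight function are all hypothetical.

```python
import numpy as np


def pca_scores(X, n_components):
    """Project the rows of X onto its top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)                      # center each covariate
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # shape: (n_rows, n_components)


def benjamini_hochberg(pvals, alpha=0.05):
    """Standard BH step-up procedure; returns a boolean rejection mask."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])         # largest rank passing its threshold
        reject[order[: k + 1]] = True
    return reject


def weighted_bh(pvals, weights, alpha=0.05):
    """Weighted BH: apply BH to p_i / w_i, with positive weights rescaled to mean 1.

    Stand-in for a covariate-aware FDR method; NOT the procedure from the paper.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.mean()
    return benjamini_hochberg(np.asarray(pvals, dtype=float) / w, alpha)


# Simulated stand-ins (assumptions): 20 covariates per SNP and uniform p-values.
rng = np.random.default_rng(0)
n_snps = 1000
covariates = rng.normal(size=(n_snps, 20))
pvals = rng.uniform(size=n_snps)

pc1 = pca_scores(covariates, 1)[:, 0]            # first principal component score
weights = np.exp(0.5 * pc1 / pc1.std())          # hypothetical weight function
print(benjamini_hochberg(pvals).sum(), weighted_bh(pvals, weights).sum())
```

The weights depend only on the covariates, never on the p-values, which is the usual condition for weighted BH to retain FDR control. Note that the sign of a principal component is arbitrary, so in practice the direction of any weighting would need to be justified from the covariates themselves.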
Abstract
Genome-wide association studies (GWAS) have led to the discovery of numerous single nucleotide polymorphisms (SNPs) associated with various phenotypes and complex diseases. However, the identified genetic variants do not fully explain the heritability of complex traits, a gap known as the missing heritability problem. To address this challenge and accurately control false positives while maximizing true associations, we propose two approaches involving linkage disequilibrium (LD) scores as covariates. We apply principal component analysis (PCA), a dimensionality reduction technique, to control the False Discovery Rate (FDR) in the presence of high-dimensional covariates. This method not only provides a convenient interpretation of how multiple covariates in high dimensions affect the control of FDR but also offers higher statistical power compared to cases where covariates are not…
Taxonomy
Topics: Genetic Associations and Epidemiology · Advanced Causal Inference Techniques · Bioinformatics and Genomic Networks
