Bayesian variable selection regression for genome-wide association studies and other large-scale problems
Yongtao Guan, Matthew Stephens

TL;DR
This paper applies Bayesian Variable Selection Regression (BVSR) to large-scale genome-wide association studies, arguing that it yields interpretable measures of confidence for each variable and can estimate heritability even when many effects are too small to detect individually.
Contribution
The paper introduces prior specifications for BVSR tailored to GWAS, framed in terms of the proportion of variance explained, and addresses the difficulty of variable selection when many effects are tiny.
Findings
BVSR provides interpretable confidence measures for variable inclusion.
BVSR can estimate the total proportion of variance explained by genetic variants.
Application to GWAS data sheds light on missing heritability issues.
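To make the idea of an "interpretable confidence measure for variable inclusion" concrete, here is a minimal sketch of posterior inclusion probabilities on a toy problem. It is not the paper's prior specification or computational scheme. It assumes a simple spike-and-slab setup with a known noise standard deviation `sigma_e`, a fixed effect-size standard deviation `sigma_b`, and a fixed prior inclusion probability `pi` (all hypothetical names and values), and it enumerates every model exhaustively, which is only feasible for a handful of covariates.

```python
import itertools
import numpy as np

def inclusion_probabilities(X, y, pi=0.2, sigma_b=1.0, sigma_e=1.0):
    """Exact posterior inclusion probabilities by enumerating all 2^p models.

    Assumed toy model (not taken from the paper):
      gamma_j ~ Bernoulli(pi), beta_j | gamma_j = 1 ~ N(0, sigma_b^2),
      y | beta ~ N(X beta, sigma_e^2 I), with sigma_e treated as known.
    """
    n, p = X.shape
    models = list(itertools.product([0, 1], repeat=p))
    log_post = np.empty(len(models))
    for m, gamma in enumerate(models):
        idx = np.flatnonzero(gamma)
        Xg = X[:, idx]
        # Integrating out beta gives y | gamma ~ N(0, C).
        C = sigma_e**2 * np.eye(n) + sigma_b**2 * (Xg @ Xg.T)
        _, logdet = np.linalg.slogdet(C)
        quad = y @ np.linalg.solve(C, y)
        log_lik = -0.5 * (n * np.log(2 * np.pi) + logdet + quad)
        k = len(idx)
        log_prior = k * np.log(pi) + (p - k) * np.log(1 - pi)
        log_post[m] = log_lik + log_prior
    # Normalize in a numerically stable way.
    w = np.exp(log_post - log_post.max())
    w /= w.sum()
    # PIP_j is the total posterior mass of models that include variable j.
    return w @ np.array(models)

# Simulated genotypes (0/1/2 allele counts) with one truly relevant SNP.
rng = np.random.default_rng(0)
n, p = 200, 6
X = rng.binomial(2, 0.3, size=(n, p)).astype(float)
X -= X.mean(axis=0)
beta = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ beta + rng.normal(size=n)
y -= y.mean()

pip = inclusion_probabilities(X, y)
print(np.round(pip, 3))
```

Each entry of `pip` is the posterior probability that the corresponding covariate is in the model, which is the kind of per-variable summary the findings above refer to. Real GWAS have far too many SNPs for enumeration, so this sketch only illustrates the quantity being estimated, not how to compute it at scale.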
Abstract
We consider applying Bayesian Variable Selection Regression, or BVSR, to genome-wide association studies and similar large-scale regression problems. Currently, typical genome-wide association studies measure hundreds of thousands, or millions, of genetic variants (SNPs), in thousands or tens of thousands of individuals, and attempt to identify regions harboring SNPs that affect some phenotype or outcome of interest. This goal can naturally be cast as a variable selection regression problem, with the SNPs as the covariates in the regression. Characteristic features of genome-wide association studies include the following: (i) a focus primarily on identifying relevant variables, rather than on prediction; and (ii) many relevant covariates may have tiny effects, making it effectively impossible to confidently identify the complete "correct" subset of variables. Taken together, these…
