Application of compressed sensing to genome wide association studies and genomic selection
Shashaank Vattikuti, James J. Lee, Christopher C. Chang, Stephen D. H. Hsu, Carson C. Chow

TL;DR
This paper demonstrates that compressed sensing techniques can effectively identify trait-associated genetic loci in genome-wide association studies and genomic selection, even when the number of markers exceeds sample size, with phase transitions indicating successful recovery.
Contribution
It introduces the application of compressed sensing theory and algorithms to GWAS and GS, providing conditions for successful locus identification based on heritability and sample size.
Findings
Successful locus identification depends on heritability and sample size.
As sample size increases, a phase transition to complete selection occurs; crossing it indicates that the true effects are being recovered.
For heritability 0.5, a sample size of roughly 30 times the number of nonzero-effect loci suffices.
Abstract
We show that the signal-processing paradigm known as compressed sensing (CS) is applicable to genome-wide association studies (GWAS) and genomic selection (GS). The aim of GWAS is to isolate trait-associated loci, whereas GS attempts to predict the phenotypic values of new individuals on the basis of training data. CS addresses a problem common to both endeavors, namely that the number of genotyped markers often greatly exceeds the sample size. We show using CS methods and theory that all loci of nonzero effect can be identified (selected) using an efficient algorithm, provided that they are sufficiently few in number (sparse) relative to sample size. For heritability h² = 1, there is a sharp phase transition to complete selection as the sample size is increased. For heritability values less than one, complete selection can still occur although the transition is smoothed. The transition…
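The selection step described in the abstract can be illustrated with L1-penalized regression (the lasso), a standard CS-style method. The sketch below is a toy under stated assumptions, not the authors' implementation: the genotypes are Gaussian stand-ins rather than real marker data, the effect sizes are ±1, and the names `simulate`, `lasso_cd`, and the penalty value `lam=0.1` are hypothetical choices made for illustration. It simulates a case with more markers than samples (p = 100, n = 60), few causal loci (s = 3), and h² = 1, then checks which loci the fit ranks highest.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator, the proximal map of the L1 penalty."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, sweeps=100):
    """Minimise (1/2n)||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent.

    X is a list of p columns, each a list of n values.
    """
    n, p = len(y), len(X)
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, kept up to date
    col_sq = [sum(v * v for v in col) / n for col in X]
    for _ in range(sweeps):
        for j in range(p):
            col = X[j]
            # correlation of column j with the partial residual (column j's own fit added back)
            rho = sum(c * r for c, r in zip(col, resid)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * col[i]
                beta[j] = new
    return beta

def simulate(n, p, s, h2, rng):
    """Toy data (assumption): Gaussian 'genotypes', s causal loci with effects +/-1."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
    causal = sorted(rng.sample(range(p), s))
    effects = {j: rng.choice([-1.0, 1.0]) for j in causal}
    g = [sum(effects[j] * X[j][i] for j in causal) for i in range(n)]
    # genetic variance is about s; pick the noise variance so var(g)/var(y) is about h2
    noise_sd = (s * (1.0 - h2) / h2) ** 0.5
    y = [gi + rng.gauss(0.0, noise_sd) for gi in g]
    return X, y, causal

rng = random.Random(0)
X, y, causal = simulate(n=60, p=100, s=3, h2=1.0, rng=rng)
beta = lasso_cd(X, y, lam=0.1)
# rank loci by absolute estimated effect and keep the s largest
top = sorted(sorted(range(len(beta)), key=lambda j: -abs(beta[j]))[:3])
print("causal loci:", causal)
print("top-ranked loci:", top)
```

Rerunning `simulate` with a smaller `n` or with `h2` below 1 gives a rough feel for how recovery degrades, though a toy of this size is not a substitute for the phase-transition analysis in the paper.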
Taxonomy
Topics: Gene expression and cancer classification · Genomic variations and chromosomal abnormalities · RNA and protein synthesis mechanisms
