TL;DR
This paper introduces an iterative hard thresholding algorithm for GWAS that improves model selection accuracy and computational efficiency, enabling analysis on standard desktop computers.
Contribution
The paper presents a novel IHT algorithm tailored for GWAS, enhancing model selection accuracy and computational feasibility compared to existing penalized regression methods.
Findings
IHT reduces false positives and false negatives in GWAS model selection relative to existing penalized regression methods.
IHT is computationally competitive with penalized regression methods.
Parallel implementation allows analysis on commodity hardware.
Abstract
A genome-wide association study (GWAS) correlates marker variation with trait variation in a sample of individuals. Each study subject is genotyped at a multitude of SNPs (single nucleotide polymorphisms) spanning the genome. Here we assume that subjects are unrelated and collected at random and that trait values are normally distributed or transformed to normality. Over the past decade, researchers have been remarkably successful in applying GWAS analysis to hundreds of traits. The massive amount of data produced in these studies presents unique computational challenges. Penalized regression with LASSO or MCP penalties is capable of selecting a handful of associated SNPs from millions of potential SNPs. Unfortunately, model selection can be corrupted by false positives and false negatives, obscuring the genetic underpinning of a trait. This paper introduces the iterative hard…
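To make the idea of iterative hard thresholding concrete, here is a minimal NumPy sketch of the generic textbook form of IHT for sparse least squares: take a gradient step, then keep only the k largest-magnitude coefficients. It is not the paper's implementation. The function names `hard_threshold` and `iht`, the fixed step size 1/‖X‖₂², and the stopping rule are all assumptions chosen for illustration; the paper's step-size rule and parallel design may differ.

```python
import numpy as np

def hard_threshold(b, k):
    """Keep the k largest-magnitude entries of b and set the rest to zero."""
    out = np.zeros_like(b)
    if k > 0:
        keep = np.argpartition(np.abs(b), -k)[-k:]
        out[keep] = b[keep]
    return out

def iht(X, y, k, n_iter=500, tol=1e-10):
    """Approximately minimize 0.5 * ||y - X b||^2 subject to ||b||_0 <= k.

    Assumption: a fixed step size 1 / ||X||_2^2 (inverse of the largest
    eigenvalue of X^T X), which is a simple, conservative choice.
    """
    p = X.shape[1]
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    b = np.zeros(p)
    for _ in range(n_iter):
        neg_grad = X.T @ (y - X @ b)          # negative gradient of the loss
        b_new = hard_threshold(b + step * neg_grad, k)
        converged = np.linalg.norm(b_new - b) <= tol * (1.0 + np.linalg.norm(b))
        b = b_new
        if converged:
            break
    return b

if __name__ == "__main__":
    # Synthetic stand-in for a genotype matrix: 200 subjects, 50 markers,
    # 3 of which truly affect the trait.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 50))
    beta = np.zeros(50)
    beta[[3, 17, 42]] = [2.0, -3.0, 1.5]
    y = X @ beta + 0.1 * rng.standard_normal(200)
    b_hat = iht(X, y, k=3)
    print(np.flatnonzero(b_hat))
```

Unlike LASSO, the hard-thresholding step does not shrink the retained coefficients toward zero; it only enforces the sparsity level k, which in this sketch is supplied by the caller.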
