Model-based clustering for identifying disease-associated SNPs in case-control genome-wide association studies

Yan Xu; Li Xing; Jessica Su; Xuekui Zhang; Weiliang Qiu

arXiv:1806.08456 · stat.ME · September 25, 2019

TL;DR

This paper introduces a model-based clustering method for GWAS data that improves detection of disease-associated SNPs by leveraging information across SNPs and controlling false discovery rates more effectively.

Contribution

The authors propose a model-based clustering approach that recasts the high-dimension, small-sample-size GWAS problem as a low-dimension, large-sample-size one by grouping SNPs into three clusters, outperforming the traditional SNP-wise approach in simulations and real data analysis.

Findings

01 Better control of false discovery rate (FDR) in simulations

02 Detection of known and novel SNPs in real GWAS data

03 Outperforms traditional SNP-wise approach in sensitivity

Abstract

Genome-wide association studies (GWASs) aim to detect genetic risk factors for complex human diseases by identifying disease-associated single-nucleotide polymorphisms (SNPs). The traditional SNP-wise approach with multiple testing adjustment is over-conservative and lacks power in many GWASs. In this article, we propose a model-based clustering method that transforms the challenging high-dimension-small-sample-size problem into a low-dimension-large-sample-size problem and borrows information across SNPs by grouping SNPs into three clusters. We pre-specify the cluster patterns by the minor allele frequencies of SNPs in cases versus controls, and enforce the patterns with prior distributions. In the simulation studies, our proposed model outperforms the traditional SNP-wise approach, showing better control of the false discovery rate (FDR) and higher sensitivity. We re-analyzed…
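For readers unfamiliar with the baseline the abstract compares against, the following is a minimal sketch of SNP-wise multiple testing adjustment. It is not the authors' clustering method. The abstract does not name a specific adjustment, so the choice of the Bonferroni and Benjamini-Hochberg procedures here is an assumption, as are the function names and the example p-values, which are made up for illustration.

```python
def bonferroni(p_values, alpha=0.05):
    """Flag SNPs whose p-value passes the Bonferroni threshold alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Flag SNPs rejected by the Benjamini-Hochberg step-up procedure.

    Illustrative baseline only (an assumption, not taken from the paper).
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical per-SNP p-values, one test per SNP.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
bonf_flags = bonferroni(example_p)
bh_flags = benjamini_hochberg(example_p)
```

Both procedures test each SNP separately and then adjust for the number of tests, which is the setting the abstract describes as over-conservative. The proposed method instead borrows information across SNPs through the three clusters.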
