A Weighted U Statistic for Genetic Association Analyses of Sequencing Data
Changshuai Wei, Ming Li, Zihuai He, Olga Vsevolozhskaya, Daniel J. Schaid, and Qing Lu

TL;DR
This paper introduces WU-seq, a non-parametric weighted U statistic method for genetic association analysis of high-dimensional sequencing data. It is more robust than SKAT when model assumptions are violated and performs comparably when they hold.
Contribution
The paper presents WU-seq, a novel non-parametric method for association analysis of sequencing data that makes no assumption about the underlying disease model or phenotype distribution.
Findings
WU-seq outperforms SKAT when assumptions are violated
WU-seq performs comparably to SKAT under correct assumptions
Identified association between ANGPTL4 and VLDL cholesterol in DHS data
Abstract
With advances in next-generation sequencing technology, massive amounts of sequencing data are being generated, offering a great opportunity to comprehensively investigate the role of rare variants in the genetic etiology of complex diseases. Nevertheless, this poses a great challenge for the statistical analysis of high-dimensional sequencing data. Association analyses based on traditional statistical methods suffer substantial power loss because of the low frequency of genetic variants and the extremely high dimensionality of the data. We developed a weighted U statistic, referred to as WU-seq, for the high-dimensional association analysis of sequencing data. Based on a non-parametric U statistic, WU-seq makes no assumption about the underlying disease model or phenotype distribution, and can be applied to a variety of phenotypes. Through simulation studies and an empirical study, we…
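The general idea of a weighted U statistic can be illustrated with a minimal sketch. This is not the authors' implementation: the rank-based phenotype scores, the allele-frequency variant weights, the cross-product genetic similarity, and the permutation p-value are all assumptions chosen for illustration, and every function name below is hypothetical. The sketch sums, over all pairs of subjects, a phenotype kernel (product of rank scores) weighted by how genetically similar the pair is.

```python
import numpy as np

def rank_scores(y):
    """Standardized ranks of the phenotype (assumes no ties).
    Using ranks makes the statistic invariant to monotone transformations of y."""
    y = np.asarray(y, dtype=float)
    ranks = np.argsort(np.argsort(y))          # 0..n-1
    s = ranks - ranks.mean()
    return s / s.std()

def similarity_weights(G):
    """Pairwise genetic similarity from an n x p genotype matrix coded 0/1/2.
    Assumption: rarer variants are up-weighted by 1/sqrt(maf*(1-maf))."""
    G = np.asarray(G, dtype=float)
    maf = np.clip(G.mean(axis=0) / 2.0, 1e-6, 1 - 1e-6)
    v = 1.0 / np.sqrt(maf * (1.0 - maf))
    Gw = (G - G.mean(axis=0)) * v              # center, then weight each variant
    W = Gw @ Gw.T
    np.fill_diagonal(W, 0.0)                   # a U statistic uses pairs i != j only
    return W

def weighted_u(s, W):
    """U = sum over i != j of W[i, j] * s[i] * s[j] (diagonal of W is zero)."""
    return float(s @ W @ s)

def wu_permutation_test(y, G, n_perm=999, seed=0):
    """Permutation p-value (a stand-in chosen for simplicity)."""
    rng = np.random.default_rng(seed)
    s = rank_scores(y)
    W = similarity_weights(G)
    u_obs = weighted_u(s, W)
    u_perm = np.array([weighted_u(rng.permutation(s), W) for _ in range(n_perm)])
    p = (1 + np.sum(u_perm >= u_obs)) / (n_perm + 1)
    return u_obs, float(p)

if __name__ == "__main__":
    # Simulated data only: 200 subjects, 20 rare variants, heavy-tailed noise.
    rng = np.random.default_rng(42)
    G = rng.binomial(2, 0.05, size=(200, 20))
    y = G.sum(axis=1) + rng.standard_t(df=2, size=200)
    u, p = wu_permutation_test(y, G)
    print(f"U = {u:.2f}, permutation p = {p:.4f}")
```

Because the phenotype enters only through its ranks, a heavy-tailed or otherwise non-normal phenotype does not change how the statistic is computed, which is the kind of robustness a distribution-free method aims for.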
