A Weighted U Statistic for Genetic Association Analyses of Sequencing Data

Changshuai Wei; Ming Li; Zihuai He; Olga Vsevolozhskaya; Daniel J. Schaid; and Qing Lu

arXiv:1505.01204·stat.ME·August 18, 2025

TL;DR

This paper introduces WU-seq, a non-parametric weighted U statistic for genetic association analysis of high-dimensional sequencing data. WU-seq is more robust than SKAT when the assumptions are violated and performs comparably when they hold.

Contribution

The paper presents WU-seq, a novel non-parametric method that enhances association analysis of sequencing data without assuming specific phenotype distributions.

Findings

01. WU-seq outperforms SKAT when assumptions are violated

02. WU-seq performs comparably to SKAT under correct assumptions

03. Identified association between ANGPTL4 and VLDL cholesterol in DHS data

Abstract

With advancements in next-generation sequencing technology, massive amounts of sequencing data are being generated, offering a great opportunity to comprehensively investigate the role of rare variants in the genetic etiology of complex diseases. Nevertheless, this poses a great challenge for the statistical analysis of high-dimensional sequencing data. Association analyses based on traditional statistical methods suffer substantial power loss because of the low frequency of genetic variants and the extremely high dimensionality of the data. We developed a weighted U statistic, referred to as WU-seq, for the high-dimensional association analysis of sequencing data. Based on a non-parametric U statistic, WU-seq makes no assumptions about the underlying disease model or phenotype distribution, and can be applied to a variety of phenotypes. Through simulation studies and an empirical study, we…
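
The abstract describes WU-seq only at a high level, so the following is a minimal sketch of the general idea behind a weighted U statistic, not the authors' implementation. It sums a phenotype kernel over all pairs of subjects, weighting each pair by how genetically similar the two subjects are. Everything specific here is an assumption made for illustration: the rank-based normal scores used as the phenotype kernel, the `exp(-distance)` similarity weight, the permutation p-value (swapped in for whatever inference procedure the paper uses), and all function and parameter names.

```python
import numpy as np
from statistics import NormalDist


def rank_normal_scores(y):
    """Convert phenotypes to normal scores via their ranks (assumed kernel input)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    ranks = np.argsort(np.argsort(y)) + 1  # ranks 1..n; ties broken by position
    std_normal = NormalDist()
    return np.array([std_normal.inv_cdf(r / (n + 1)) for r in ranks])


def genetic_similarity(G, variant_weights=None):
    """Pairwise weights exp(-d_ij), with d_ij a weighted squared genotype distance.

    G is an (n_subjects, n_variants) array of minor-allele counts (0, 1, 2).
    variant_weights is hypothetical, e.g. larger values for rarer variants.
    """
    G = np.asarray(G, dtype=float)
    if variant_weights is None:
        w = np.ones(G.shape[1])
    else:
        w = np.asarray(variant_weights, dtype=float)
    diff = G[:, None, :] - G[None, :, :]          # shape (n, n, p)
    dist = (w * diff ** 2).sum(axis=2) / w.sum()  # shape (n, n)
    return np.exp(-dist)


def _u_from_scores(q, K):
    """Sum K[i, j] * q[i] * q[j] over all ordered pairs with i != j."""
    off_diag = ~np.eye(q.size, dtype=bool)
    return float((K * np.outer(q, q))[off_diag].sum())


def weighted_u(y, G, variant_weights=None):
    """Weighted U statistic: large values mean that genetically similar
    subjects also tend to have similar phenotypes."""
    q = rank_normal_scores(y)
    K = genetic_similarity(G, variant_weights)
    return _u_from_scores(q, K)


def permutation_pvalue(y, G, n_perm=999, seed=0):
    """One-sided permutation p-value (an assumption, not the paper's test)."""
    rng = np.random.default_rng(seed)
    q = rank_normal_scores(y)
    K = genetic_similarity(G)
    observed = _u_from_scores(q, K)
    hits = sum(
        _u_from_scores(rng.permutation(q), K) >= observed
        for _ in range(n_perm)
    )
    return (1 + hits) / (1 + n_perm)
```

Because the phenotype enters only through its ranks, the sketch does not depend on the phenotype following any particular distribution, which is the property the abstract emphasizes. Calling `weighted_u(y, G)` with a phenotype vector `y` and a genotype matrix `G` returns the statistic, and `permutation_pvalue(y, G)` gives a rough significance estimate under the stated assumptions.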
