A Generalized Similarity U Test for Multivariate Analysis of Sequencing Data
Changshuai Wei, Qing Lu

TL;DR
This paper introduces GSU, a generalized similarity U test designed for multivariate analysis of high-dimensional sequencing data, addressing challenges of phenotype diversity and low-frequency variants in genetic association studies.
Contribution
The paper presents GSU, a novel similarity-based test that handles high-dimensional genotypes and phenotypes, with theoretical properties and practical advantages demonstrated through simulations and real data.
Findings
GSU outperforms existing methods in power and robustness
GSU effectively analyzes multiple phenotypes with different distributions
Identified joint gene associations in the Dallas Heart Study
Abstract
Sequencing-based studies are emerging as a major tool for genetic association studies of complex diseases. These studies pose great challenges to traditional statistical methods (e.g., single-locus analyses based on regression methods) because of the high dimensionality of the data and the low frequency of genetic variants. In addition, there is great interest in biology and epidemiology in identifying genetic risk factors that contribute to multiple disease phenotypes. The multiple phenotypes often follow different distributions, which violates the assumptions of most current methods. In this paper, we propose a generalized similarity U test, referred to as GSU. GSU is a similarity-based test and can handle high-dimensional genotypes and phenotypes. We studied the theoretical properties of GSU and provided an efficient p-value calculation for the association test as well as the sample…
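To make the idea of a similarity-based U test concrete, here is a minimal Python sketch. It is not the paper's implementation: the excerpt above does not give the GSU formula, so the sketch assumes a generic form in which the statistic averages the product of a centered phenotype similarity and a centered genotype similarity over all pairs of distinct subjects. The function names (`gaussian_similarity`, `center`, `similarity_u`, `permutation_pvalue`) and the Gaussian similarity are illustrative assumptions, and the p-value here comes from a plain permutation test rather than the efficient calculation the abstract mentions.

```python
import numpy as np


def gaussian_similarity(x):
    """Pairwise similarity exp(-squared distance / number of features).

    Assumed choice for illustration; any symmetric similarity could be used.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / x.shape[1])


def center(sim):
    """Double-center a similarity matrix so every row and column sums to zero."""
    n = sim.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n
    return h @ sim @ h


def similarity_u(pheno_sim, geno_sim):
    """Average of pheno_sim[i, j] * geno_sim[i, j] over all pairs with i != j."""
    n = pheno_sim.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    return float((pheno_sim[off_diag] * geno_sim[off_diag]).sum() / (n * (n - 1)))


def permutation_pvalue(pheno_sim, geno_sim, n_perm=999, seed=0):
    """One-sided permutation p-value: shuffle subjects in the phenotype matrix."""
    rng = np.random.default_rng(seed)
    observed = similarity_u(pheno_sim, geno_sim)
    n = pheno_sim.shape[0]
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        if similarity_u(pheno_sim[np.ix_(perm, perm)], geno_sim) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)
```

Under these assumptions, a genotype matrix `g` (subjects by variants) and a phenotype matrix `y` (subjects by traits) would be tested with `permutation_pvalue(center(gaussian_similarity(y)), center(gaussian_similarity(g)))`. Because phenotypes enter only through a similarity matrix, traits with different distributions can in principle be combined by choosing a suitable similarity, which is the property the abstract highlights.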
Taxonomy
Topics: Genetic Associations and Epidemiology · Liver Disease Diagnosis and Treatment · Gene expression and cancer classification
