Learning the optimal scale for GWAS through hierarchical SNP aggregation
Florent Guinot, Marie Szafranski, Christophe Ambroise (LaMME), Franck Samson (MIG)

TL;DR
This paper introduces a hierarchical SNP aggregation method leveraging haplotype structures to improve the precision of detecting genetic associations in GWAS, outperforming traditional univariate approaches.
Contribution
It presents a novel dimension-reduction technique for GWAS that enhances association detection accuracy by aggregating SNPs based on haplotype information.
Findings
Improved precision in identifying true genetic associations.
Outperforms standard univariate and multivariate methods.
Effective on both synthetic and real GWAS data.
Abstract
Motivation: Genome-Wide Association Studies (GWAS) seek to identify causal genomic variants associated with rare human diseases. The classical statistical approach for detecting these variants is based on univariate hypothesis testing, with healthy individuals being tested against affected individuals at each locus. Given that an individual's genotype is characterized by up to one million SNPs, this approach lacks precision, since it may yield a large number of false positives that can lead to erroneous conclusions about genetic associations with the disease. One way to improve the detection of true genetic associations is to reduce the number of hypotheses to be tested by grouping SNPs. Results: We propose a dimension-reduction approach which can be applied in the context of GWAS by making use of the haplotype structure of the human genome. We compare our method with standard…
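The multiple-testing argument in the Motivation can be made concrete with a little arithmetic. The sketch below is not taken from the paper: it assumes a per-test significance level of 0.05, a Bonferroni correction, and a purely hypothetical count of 10,000 SNP groups, and the function names are illustrative. It only shows why testing fewer hypotheses eases the false-positive burden.

```python
# Illustrative arithmetic only; alpha, the Bonferroni correction, and the
# group count below are assumptions, not values from the paper.

def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of false positives if every null hypothesis is true."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


scenarios = [
    ("one test per SNP", 1_000_000),        # "up to one million SNPs"
    ("one test per SNP group", 10_000),     # hypothetical number of groups
]

for label, n_tests in scenarios:
    fp = expected_false_positives(n_tests)
    thr = bonferroni_threshold(n_tests)
    print(f"{label}: {n_tests} tests, "
          f"expected false positives without correction = {fp:.0f}, "
          f"Bonferroni threshold = {thr:.1e}")
```

With fewer hypotheses, the corrected threshold is less stringent, so a true association of a given strength is easier to detect. How the groups are actually built from the haplotype structure is the subject of the paper and is not reproduced here.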
