Divide and conquer approach for genome-wide association studies
Mustafa İsmail Özkaraca, Mulya Agung, Pau Navarro, Albert Tenesa

TL;DR
This paper introduces a faster and more efficient method for genome-wide association studies by splitting data into smaller parts and combining results.
Contribution
A novel divide-and-conquer GWAS pipeline that reduces computational costs while maintaining accuracy and handling population structure.
Findings
The pipeline achieves the same discovery levels as standard GWAS at reduced computational cost.
Effectively handles related individuals and controls inflated effect sizes in real datasets.
Supports incremental analysis as new samples are added and improves reproducibility.
Abstract
Genome-wide association studies (GWAS) are computationally intensive, requiring significant time and resources with computational complexity scaling at least linearly with sample size. Here, we present an accurate and resource-efficient pipeline for GWAS that mitigates the impact of sample size on computational demands. Our approach involves (1) randomly partitioning the cohort into equally sized sub-cohorts, (2) conducting independent GWAS within each sub-cohort, and (3) integrating the results using a novel meta-analysis technique that accounts for population structure and other confounders between sub-cohorts. Importantly, we demonstrate through simulations and real-data examples in humans that our approach effectively manages analyzing related individuals, a critical factor in real datasets, while controlling for inflated effect sizes, a phenomenon known as winner's curse. We show…
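The three steps in the abstract can be illustrated with a minimal sketch on simulated data. It rests on several assumptions. The paper's meta-analysis technique is described as novel and is not specified in this summary, so the sketch substitutes a standard inverse-variance-weighted fixed-effect meta-analysis. It does not model relatedness, population structure, or winner's curse. The per-variant test is a plain simple linear regression, and the function names `simple_gwas` and `divide_and_conquer_gwas` are hypothetical.

```python
import numpy as np


def simple_gwas(genotypes, phenotype):
    """Per-variant simple linear regression (with intercept).

    genotypes: (n_samples, n_variants) array of allele counts.
    phenotype: (n_samples,) array.
    Returns per-variant effect estimates and standard errors.
    Assumes no variant is monomorphic in the given samples.
    """
    n = genotypes.shape[0]
    g = genotypes - genotypes.mean(axis=0)   # centre each variant
    y = phenotype - phenotype.mean()         # centre the phenotype
    sxx = (g ** 2).sum(axis=0)
    beta = (g * y[:, None]).sum(axis=0) / sxx
    resid_ss = (y ** 2).sum() - beta ** 2 * sxx
    se = np.sqrt(resid_ss / (n - 2) / sxx)
    return beta, se


def divide_and_conquer_gwas(genotypes, phenotype, n_parts, rng):
    """Illustrative stand-in for the pipeline, not the authors' method."""
    # (1) Randomly partition samples into (near-)equally sized sub-cohorts.
    shuffled = rng.permutation(genotypes.shape[0])
    parts = np.array_split(shuffled, n_parts)

    # (2) Run an independent GWAS within each sub-cohort.
    results = [simple_gwas(genotypes[p], phenotype[p]) for p in parts]
    betas = np.array([b for b, _ in results])
    ses = np.array([s for _, s in results])

    # (3) Combine with inverse-variance weights (assumed substitute for
    #     the paper's meta-analysis step).
    weights = 1.0 / ses ** 2
    beta_meta = (weights * betas).sum(axis=0) / weights.sum(axis=0)
    se_meta = np.sqrt(1.0 / weights.sum(axis=0))
    return beta_meta, se_meta


# Simulated cohort: 2000 unrelated samples, 50 variants, one causal variant.
rng = np.random.default_rng(0)
n_samples, n_variants = 2000, 50
G = rng.binomial(2, 0.3, size=(n_samples, n_variants)).astype(float)
true_beta = np.zeros(n_variants)
true_beta[0] = 0.5
y = G @ true_beta + rng.normal(size=n_samples)

beta_full, se_full = simple_gwas(G, y)
beta_dc, se_dc = divide_and_conquer_gwas(G, y, n_parts=4, rng=rng)
```

In this toy setting `beta_dc` and `beta_full` can be compared variant by variant. Each sub-cohort analysis touches only a quarter of the samples at a time, which is the property the pipeline relies on to limit the effect of sample size on resource use.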
Taxonomy
Topics: Genetic Associations and Epidemiology · Genetic Mapping and Diversity in Plants and Animals · Genetic and phenotypic traits in livestock
