Methodological Issues in Multistage Genome-Wide Association Studies
Duncan C. Thomas, Graham Casey, David V. Conti, Robert W. Haile, Juan Pablo Lewinger, Daniel O. Stram

TL;DR
This paper discusses the advantages and limitations of two-stage genome-wide association studies, highlighting cost savings, optimal design considerations, and the shift towards single-stage high-density genotyping due to declining costs.
Contribution
It provides an analysis of the cost-effectiveness and methodological considerations of two-stage GWAS designs and clarifies misconceptions about their role in discovery versus replication.
Findings
Two-stage GWAS can save about 50% in costs with comparable power.
Declining genotyping costs lead to a shift towards single-stage designs.
The two-stage design is a discovery method, not a form of replication.
Abstract
Because of the high cost of commercial genotyping chip technologies, many investigations have used a two-stage design for genome-wide association studies, using part of the sample for an initial discovery of "promising" SNPs at a less stringent significance level and the remainder in a joint analysis of just these SNPs using custom genotyping. Typical cost savings of about 50% are possible with this design to obtain comparable levels of overall type I error and power by using about half the sample for stage I and carrying about 0.1% of SNPs forward to the second stage, the optimal design depending primarily upon the ratio of costs per genotype for stages I and II. However, with the rapidly declining costs of the commercial panels, the generally low observed ORs of current studies, and many studies aiming to test multiple hypotheses and multiple endpoints, many investigators are…
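The cost arithmetic behind the "about 50%" figure can be sketched as follows. This is a minimal illustration, not the authors' optimization: it counts genotyping cost only and does not model power or type I error. The function name `relative_cost`, its parameters, and the stage II to stage I per-genotype cost ratio of 20 are assumptions chosen for illustration; the abstract gives only the stage I sample fraction (about half) and the fraction of SNPs carried forward (about 0.1%).

```python
def relative_cost(frac_stage1, frac_snps_forward, cost_ratio):
    """Genotyping cost of a two-stage design relative to a one-stage design.

    frac_stage1       -- fraction of the total sample genotyped on the full panel
    frac_snps_forward -- fraction of SNPs carried forward to stage II
    cost_ratio        -- per-genotype cost in stage II divided by that in stage I
                         (hypothetical input; custom genotyping usually costs
                         more per genotype than a commercial panel)

    The one-stage design genotypes every subject on every SNP, so its cost is
    normalized to 1.
    """
    if not 0.0 < frac_stage1 <= 1.0:
        raise ValueError("frac_stage1 must be in (0, 1]")
    if not 0.0 <= frac_snps_forward <= 1.0:
        raise ValueError("frac_snps_forward must be in [0, 1]")
    if cost_ratio <= 0.0:
        raise ValueError("cost_ratio must be positive")

    # Stage I: a fraction of the subjects, all SNPs, at the stage I unit cost.
    stage1 = frac_stage1
    # Stage II: the remaining subjects, only the promising SNPs, at the
    # (higher) stage II unit cost.
    stage2 = (1.0 - frac_stage1) * frac_snps_forward * cost_ratio
    return stage1 + stage2


if __name__ == "__main__":
    # Half the sample in stage I, 0.1% of SNPs carried forward,
    # assumed cost ratio of 20.
    rel = relative_cost(0.5, 0.001, 20.0)
    print(f"relative cost: {rel:.3f}, savings: {1.0 - rel:.1%}")
```

Under these assumed inputs the two-stage design costs roughly 0.51 of the single-stage design, in line with the savings of about 50% quoted above. Because so few SNPs are carried forward, the stage I term dominates unless the cost ratio is very large, so the saving in this sketch is driven mainly by the fraction of the sample genotyped on the full panel.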
