Alternative Methods for H1 Simulations in Genome Wide Association Studies
Vittorio Perduca, Christine Sinoquet, Raphael Mourad, Gregory Nuel

TL;DR
This paper introduces three efficient algorithms for simulating phenotypes under H1 in GWAS, avoiding genotype regeneration, and validates their accuracy and speed compared to existing methods like Hapgen.
Contribution
The authors propose and validate three novel algorithms for phenotype simulation in GWAS that are faster and equally accurate compared to traditional genotype-based methods.
Findings
Backward sampling is significantly faster than rejection and MCMC algorithms.
The new algorithms produce consistent results with Hapgen in realistic datasets.
Disease prevalence impacts GWAS power more than previously understood.
Abstract
Assessing the statistical power to detect susceptibility variants plays a critical role in GWA studies, from both the prospective and retrospective points of view. Power is empirically estimated by simulating phenotypes under a disease model H1. For this purpose, the "gold" standard consists of simulating genotypes given the phenotypes (e.g. Hapgen). We introduce here an alternative approach for simulating phenotypes under H1 that does not require generating new genotypes for each simulation. In order to simulate phenotypes with a fixed total number of cases and under a given disease model, we suggest three algorithms: i) a simple rejection algorithm; ii) a numerical Markov Chain Monte-Carlo (MCMC) approach; and iii) an exact and efficient backward sampling algorithm. In our study, we validated the three algorithms both on a toy-dataset and by comparing them with Hapgen on a more…
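To make the idea of sampling phenotypes with a fixed number of cases concrete, here is a minimal Python sketch of two of the three strategies named in the abstract: rejection and backward sampling. It is an illustration under stated assumptions, not the authors' implementation. It assumes each individual i has a known probability p[i] of being a case under the disease model (how p[i] is derived from genotypes is not shown here), and the function names `forward_table`, `backward_sample`, and `rejection_sample` are hypothetical. The backward sampler uses the standard forward/backward recursion for independent Bernoulli variables conditioned on their sum.

```python
import random


def forward_table(p, k):
    """F[i][j] = P(Y_1 + ... + Y_i = j) for independent Y_l ~ Bernoulli(p[l-1])."""
    n = len(p)
    F = [[0.0] * (k + 1) for _ in range(n + 1)]
    F[0][0] = 1.0
    for i in range(1, n + 1):
        pi = p[i - 1]
        for j in range(min(i, k) + 1):
            # Individual i is a control (count stays at j) ...
            F[i][j] = (1.0 - pi) * F[i - 1][j]
            # ... or a case (count moves from j - 1 to j).
            if j > 0:
                F[i][j] += pi * F[i - 1][j - 1]
    return F


def backward_sample(p, k, rng):
    """Draw phenotypes y (0 = control, 1 = case) conditioned on sum(y) == k."""
    n = len(p)
    F = forward_table(p, k)
    if F[n][k] <= 0.0:
        raise ValueError("exactly k cases has probability zero under p")
    y = [0] * n
    j = k  # cases still to be placed among individuals 1..i
    for i in range(n, 0, -1):
        if j == 0:
            break  # everyone remaining is a control
        # P(Y_i = 1 | Y_1 + ... + Y_i = j)
        prob_case = p[i - 1] * F[i - 1][j - 1] / F[i][j]
        if rng.random() < prob_case:
            y[i - 1] = 1
            j -= 1
    return y


def rejection_sample(p, k, rng, max_tries=100_000):
    """Draw independent phenotypes until exactly k cases appear."""
    for _ in range(max_tries):
        y = [1 if rng.random() < pi else 0 for pi in p]
        if sum(y) == k:
            return y
    raise RuntimeError("no sample with exactly k cases within max_tries")


# Toy usage with made-up case probabilities (assumed values, not from the paper).
rng = random.Random(0)
p_toy = [0.1, 0.2, 0.5, 0.3, 0.05, 0.4]
y_backward = backward_sample(p_toy, 2, rng)
y_rejection = rejection_sample(p_toy, 2, rng)
```

The rejection sampler wastes every draw whose case count differs from k, so its cost grows as that event becomes rarer, whereas the backward sampler pays a one-time O(n·k) table and then draws each sample in O(n). For large cohorts the table entries can underflow in floating point, so a practical version would likely work in log space; that refinement is omitted here.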
Taxonomy
Topics: Genetic Associations and Epidemiology · Bioinformatics and Genomic Networks · Genetic Mapping and Diversity in Plants and Animals
