HAP-SAMPLE2: data-based resampling for association studies with admixture
George Sun, Bryan W Ting, Fred A Wright, Yi-Hui Zhou

TL;DR
HAP-SAMPLE2 is a tool for simulating genetic data that handles admixture and rare variants, useful for large-scale genetic studies.
Contribution
It introduces features for population admixture and rare variant analysis in genotype-phenotype simulations.
Findings
HAP-SAMPLE2 supports the simulation of admixed populations and rare variants with customizable parameters.
The tool enables efficient creation of complex datasets for projects like the 1000 Genomes Project.
It includes burden testing for rare variants using multiple weighting schemes.
Abstract
HAP-SAMPLE2 extends the functionality of the original HAP-SAMPLE tool for simulating genotype-phenotype data, now with features to handle population admixture and rare variant analysis. It allows users to define parameters such as disease prevalence and allele effect sizes for both common and rare variant simulations. HAP-SAMPLE2 provides an efficient means for simulating complex datasets, suitable for large-scale projects like the 1000 Genomes Project. Its capabilities for population admixture allow users to create admixed populations or preserve substructures while introducing novel variation through artificial recombination. Additionally, the tool supports burden testing for rare variants using fixed and Madsen-Browning weighting schemes. The software, along with a detailed vignette, is available on GitHub: https://github.com/M3dical/HAPSAMPLE2.
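The two burden weighting schemes named in the abstract can be illustrated with a minimal sketch. It is not taken from the HAP-SAMPLE2 code base: the function name `burden_scores`, its arguments, and the use of NumPy are assumptions made for illustration only. It follows the published Madsen-Browning definition, in which each variant's weight is the inverse of sqrt(n * q * (1 - q)), with q estimated from unaffected individuals using a +1/+2 pseudocount. The "fixed" scheme is taken here to mean equal weights of 1, which is also an assumption.

```python
import numpy as np

def burden_scores(genotypes, is_case, scheme="madsen-browning"):
    """Per-individual rare-variant burden score (illustrative sketch).

    genotypes: (n_individuals, n_variants) array of minor-allele counts (0/1/2)
    is_case:   boolean vector, True for affected individuals
    scheme:    "fixed" (all weights 1, an assumption) or "madsen-browning"
    """
    G = np.asarray(genotypes, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    n_total = G.shape[0]

    if scheme == "fixed":
        weights = np.ones(G.shape[1])
    elif scheme == "madsen-browning":
        controls = G[~is_case]
        n_ctrl = controls.shape[0]
        # Allele frequency in unaffected individuals, with a pseudocount so
        # that variants unseen in controls still get a finite weight.
        q = (controls.sum(axis=0) + 1.0) / (2.0 * n_ctrl + 2.0)
        # Rarer variants receive larger weights.
        weights = 1.0 / np.sqrt(n_total * q * (1.0 - q))
    else:
        raise ValueError(f"unknown weighting scheme: {scheme!r}")

    # Weighted sum of minor-allele counts across variants.
    return G @ weights

# Toy data: 4 individuals (2 cases, 2 controls), 2 variants.
G = np.array([[0, 1], [1, 0], [0, 0], [2, 0]])
y = np.array([True, True, False, False])
print(burden_scores(G, y, scheme="fixed"))
print(burden_scores(G, y, scheme="madsen-browning"))
```

In the original Madsen-Browning procedure these scores are then compared between cases and controls with a rank-sum statistic assessed by permutation. Whether HAP-SAMPLE2 uses that test or another is not stated in the abstract.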
Taxonomy
Topics: Data Mining Algorithms and Applications
