# gscramble: Simulation of Admixed Individuals Without Reuse of Genetic Material

**Authors:** Eric C. Anderson, Rachael M. Giglio, Matthew G. DeSaix, Timothy J. Smyser

PMC · DOI: 10.1111/1755-0998.14069 · Molecular Ecology Resources · 2025-01-12

## TL;DR

This paper introduces gscramble, an R package that simulates admixed individuals from empirical genotypes without reusing genetic material, so that simulation-based estimates of genetic clustering accuracy are not inflated.

## Contribution

gscramble is a simulation framework that prevents resampling-induced, spurious power inflation by sampling alleles from source populations without replacement and segregating them according to species-specific recombination rates.

## Key findings

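- Sampling with replacement in genetic simulations, especially with large marker sets, inflates the apparent statistical power to assign individuals (or the alleles they carry) to source populations, a phenomenon the authors call resampling-induced, spurious power inflation (RISPI).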
- gscramble simulates admixed individuals while preserving haplotype structure and avoiding RISPI.
- The tool allows users to define complex pedigrees and track haplotype blocks from different populations.
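
The difference between the two sampling schemes can be illustrated with a minimal Python sketch. It is not the gscramble API (gscramble is an R package); the function names and the `hap0`…`hap19` identifiers are hypothetical, and the pool stands in for the haplotypes of one source population.

```python
import random

def draw_with_replacement(pool, n, rng):
    """Pick n source haplotype IDs; the same ID may be picked repeatedly."""
    return [rng.choice(pool) for _ in range(n)]

def draw_without_replacement(pool, n, rng):
    """Pick n distinct source haplotype IDs; each one is used at most once."""
    if n > len(pool):
        raise ValueError("cannot draw more haplotypes than the pool holds")
    return rng.sample(pool, n)

def reuse_count(draws):
    """Number of draws that repeat a haplotype already used by an earlier draw."""
    return len(draws) - len(set(draws))

rng = random.Random(1)
pool = [f"hap{i}" for i in range(20)]  # hypothetical source haplotype IDs

with_repl = draw_with_replacement(pool, 20, rng)
without_repl = draw_without_replacement(pool, 20, rng)

print("reused draws, with replacement:   ", reuse_count(with_repl))
print("reused draws, without replacement:", reuse_count(without_repl))
```

Any reuse means the same genetic material appears in more than one simulated individual, which is the condition the key findings associate with RISPI. Drawing without replacement rules it out by construction, at the cost of limiting the number of simulated individuals to what the source sample can supply.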

## Abstract

While a best practice for evaluating the behaviour of genetic clustering algorithms on empirical data is to conduct parallel analyses on simulated data, these types of simulation techniques often involve sampling genetic data with replacement. In this paper we demonstrate that sampling with replacement, especially with large marker sets, inflates the perceived statistical power to correctly assign individuals (or the alleles that they carry) back to source populations—a phenomenon we refer to as resampling‐induced, spurious power inflation (RISPI). To address this issue, we present gscramble, a simulation approach in R for creating biologically informed individual genotypes from empirical data that: (1) samples alleles from populations without replacement and (2) segregates alleles based on species‐specific recombination rates. This framework makes it possible to simulate admixed individuals in a way that respects the physical linkage between markers on the same chromosome and which does not suffer from RISPI. This is achieved in gscramble by allowing users to specify pedigrees of varying complexity in order to simulate admixed genotypes, segregating and tracking haplotype blocks from different source populations through those pedigrees, and then sampling—using a variety of permutation schemes—alleles from empirical data into those haplotype blocks. We demonstrate the functionality of gscramble with both simulated and empirical data sets and highlight additional uses of the package that users may find valuable.
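
The idea of segregating and tracking haplotype blocks through a pedigree can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not gscramble's implementation: `make_gamete` and `origin_at` are hypothetical names, the number of crossovers is passed in as a fixed argument rather than drawn from a recombination map, and crossover positions are placed uniformly along the chromosome.

```python
import random

def make_gamete(hap1, hap2, length, n_crossovers, rng):
    """Recombine two parental haplotypes into one gamete.

    Each haplotype is a list of (start, end, origin) segments that
    together cover the interval [0, length).
    """
    # Assumption: crossovers fall uniformly along the chromosome.
    breaks = sorted(rng.uniform(0, length) for _ in range(n_crossovers))
    bounds = [0.0] + breaks + [float(length)]
    parents = [hap1, hap2]
    current = rng.randrange(2)  # which parental haplotype the gamete starts on
    gamete = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        for start, end, origin in parents[current]:
            s, e = max(start, lo), min(end, hi)
            if s < e:
                if gamete and gamete[-1][2] == origin and gamete[-1][1] == s:
                    # Extend the previous block when the origin is unchanged.
                    gamete[-1] = (gamete[-1][0], e, origin)
                else:
                    gamete.append((s, e, origin))
        current = 1 - current  # switch haplotypes at each crossover
    return gamete

def origin_at(gamete, position):
    """Return the source population of the block containing a marker position."""
    for start, end, origin in gamete:
        if start <= position < end:
            return origin
    raise ValueError("position lies outside the chromosome")

# An F1 individual carries one chromosome from population A and one from B.
hap_a = [(0.0, 100.0, "A")]
hap_b = [(0.0, 100.0, "B")]
f1_gamete = make_gamete(hap_a, hap_b, 100.0, 3, random.Random(0))
for block in f1_gamete:
    print(block)
```

Once the blocks are known, alleles at each marker position can be copied from a source individual of the matching population (looked up here with `origin_at`), with each source individual used only once so that no genetic material is reused.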

## Full-text entities

- **Genes:** Pop1 [NCBI Gene 100154374]
- **Diseases:** FALSE (MESH:D017541), HD (MESH:D006816), GSP (MESH:D042822)
- **Species:** Sus scrofa (pig, species) [taxon 9823], Salmo trutta (river trout, species) [taxon 8032], Oncorhynchus mykiss (rainbow trout, species) [taxon 8022], Oncorhynchus clarkii (cutthroat trout, species) [taxon 30962], Salmo salar (Atlantic salmon, species) [taxon 8030]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11969638/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11969638/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/PMC11969638/full.md

---
Source: https://tomesphere.com/paper/PMC11969638