Identifying Single-Origin Rare Variants in Population Genomic Data
Josh J Reynolds, Vassiliki Koufopanou, Austin Burt

TL;DR
This paper introduces methods to identify rare genetic variants with single origins in population genomic data, improving accuracy in demographic and genetic analyses.
Contribution
The paper presents novel methods to estimate and identify single-origin doubletons in population genomic datasets.
Findings
Approximately 16% of doubletons in Anopheles gambiae data had independent origins.
A subset of doubletons with ∼99% confidence of single origin was identified.
The methods were validated using data analyses and coalescent simulations.
Abstract
Genomic analyses have shown that some mutations in large population genomic datasets may be the result of repeated, independent events at the same locus. However, the possibility of recurrent mutation is often ignored, even when it has the potential to introduce errors, such as when assuming co-ancestry for demographic analysis. Even rare variants such as doubletons, which should be particularly informative about recent demography, may have multiple origins despite arising relatively recently in the population. Here, we develop methods to (i) estimate the frequency of recurrent doubletons in a population genomic dataset from the occurrence of tri-allelic sites with two different singleton mutations and (ii) identify a subset of high confidence single-origin doubletons based on the presence of a linked rare variant on the surrounding shared haplotype. Applying these methods to data for…
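The logic behind step (i) can be illustrated with a minimal sketch. It is not the authors' estimator, which the truncated abstract does not spell out. The sketch assumes that two independent mutations at one site yield either a tri-allelic site with two different singletons (different derived bases) or a recurrent doubleton (same derived base), and that the ratio of those two outcomes depends only on the per-base mutation probabilities. The function name, the `alt_probs` parameter, and the default of equal rates to each of the three alternative bases are all assumptions introduced for illustration.

```python
import math
from typing import Mapping, Optional


def estimate_recurrent_doubletons(
    n_triallelic_singleton_pairs: int,
    n_doubletons: int,
    alt_probs: Optional[Mapping[str, float]] = None,
) -> tuple[float, float]:
    """Return (expected recurrent doubletons, fraction of all doubletons).

    alt_probs maps each alternative base to the probability that a mutation
    produces it (hypothetical input; must sum to 1). If omitted, the three
    alternative bases are assumed equally likely, which is a simplification.
    """
    if alt_probs is None:
        alt_probs = {"alt1": 1 / 3, "alt2": 1 / 3, "alt3": 1 / 3}
    if not math.isclose(sum(alt_probs.values()), 1.0):
        raise ValueError("alt_probs must sum to 1")
    if n_doubletons <= 0:
        raise ValueError("n_doubletons must be positive")

    # Probability that two independent mutations give the same derived base.
    p_same = sum(p * p for p in alt_probs.values())
    # Probability that they give different derived bases (tri-allelic site).
    p_diff = 1.0 - p_same
    if p_diff <= 0:
        raise ValueError("at least two alternative bases need nonzero probability")

    # Scale the observed tri-allelic count by the same:different odds.
    expected_recurrent = n_triallelic_singleton_pairs * p_same / p_diff
    return expected_recurrent, expected_recurrent / n_doubletons


# Hypothetical counts, not taken from the paper.
recurrent, fraction = estimate_recurrent_doubletons(200, 1000)
```

Under the equal-rates assumption the same:different odds are 1:2, so the expected number of recurrent doubletons is half the number of tri-allelic two-singleton sites. A realistic mutation spectrum, for example one with a transition bias, raises `p_same` and therefore raises the estimate.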
Taxonomy
Topics: Malaria Research and Control · Genetic diversity and population structure · Genetic Associations and Epidemiology
