Necessary and sufficient conditions for identifiability in the admixture model
Jan van Waaij

TL;DR
This paper establishes precise conditions under which the ancestral allele frequencies and admixture proportions can be uniquely identified in the admixture model, correcting previous proofs and extending the theory.
Contribution
It provides a corrected proof of identifiability conditions, introduces new necessary and sufficient criteria, and extends results to non-admixed populations.
Findings
The anchor condition on one matrix, together with an independence condition on the other, is sufficient for identifiability.
Corrected proof of necessary conditions for identifiability.
Extended conditions to non-admixed populations.
Abstract
We consider $M$ SNP data from $N$ individuals who are an admixture of $K$ unknown ancient populations. Let $\Pi_{si}$ be the frequency of the reference allele of individual $i$ at SNP $s$. So the number of reference alleles at SNP $s$ for a diploid individual is binomially distributed with parameters $2$ and $\Pi_{si}$. We suppose $\Pi_{si} = \sum_{k=1}^{K} F_{sk} Q_{ki}$, where $F_{sk}$ is the allele frequency of SNP $s$ in population $k$ and $Q_{ki}$ is the proportion of population $k$ in the ancestry of individual $i$. I am interested in the identifiability of $F$ and $Q$, up to a relabelling of the ancient populations. Under what conditions, when $\Pi = F^{1} Q^{1} = F^{2} Q^{2}$, are $F^{1}$ and $F^{2}$, and $Q^{1}$ and $Q^{2}$, equal? I show that the anchor condition (Cabreros and Storey, 2019) on one matrix together with an independence condition on the other matrix is sufficient for identifiability. I will argue that the proof of…
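The model in the abstract, and the reason identifiability can only hold up to a relabelling of the populations, can be illustrated with a small sketch. It is not taken from the paper: the matrix values, the name `Pi` for the matrix of individual allele frequencies, and the helper functions `matmul` and `sample_genotypes` are all assumptions made for illustration. The sketch builds `Pi` as the product of F and Q, shows that swapping the two population labels in both matrices leaves `Pi` unchanged, and draws diploid genotypes as binomial counts with parameters 2 and `Pi[s][i]`.

```python
import random


def matmul(A, B):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


# Hypothetical toy values: M = 3 SNPs, K = 2 populations, N = 2 individuals.
# F[s][k]: frequency of the reference allele of SNP s in population k.
F = [[0.5, 0.25],
     [0.75, 0.5],
     [0.25, 1.0]]
# Q[k][i]: proportion of population k in the ancestry of individual i.
# Each column sums to 1; individual 0 is non-admixed.
Q = [[1.0, 0.5],
     [0.0, 0.5]]

# Pi[s][i]: reference allele frequency of individual i at SNP s.
Pi = matmul(F, Q)

# Relabel the populations: swap the columns of F and the rows of Q.
F_swapped = [row[::-1] for row in F]
Q_swapped = Q[::-1]
Pi_swapped = matmul(F_swapped, Q_swapped)


def sample_genotypes(Pi, rng):
    """Draw a Binomial(2, Pi[s][i]) count for every SNP and individual."""
    return [[sum(rng.random() < p for _ in range(2)) for p in row]
            for row in Pi]


genotypes = sample_genotypes(Pi, random.Random(0))

print(Pi == Pi_swapped)  # the relabelled pair yields the same Pi
```

Because the data depend on F and Q only through `Pi`, the pairs `(F, Q)` and `(F_swapped, Q_swapped)` cannot be told apart from genotypes alone. The question posed in the abstract is when such relabellings are the only ambiguity.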
Taxonomy
Topics: Genetic Associations and Epidemiology · Forensic and Genetic Research · Genomics and Phylogenetic Studies
