Haplotype frequency inference from pooled genetic data with a latent multinomial model
Yong See Foo, Jennifer A. Flegg

TL;DR
This paper introduces exact inference methods for estimating haplotype frequencies from pooled genetic data using a latent multinomial model, addressing limitations of normal approximation methods especially in large-scale studies.
Contribution
The authors develop exact, scalable algorithms based on a latent multinomial model, improving accuracy over existing approximate methods for pooled genetic data analysis.
Findings
Exact methods outperform approximate ones on synthetic data.
Latent count sampling via Markov bases is computationally efficient.
Methods are applicable to time-series and hierarchical genetic data.
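To make the Markov basis idea in the findings above concrete, here is a minimal sketch for the smallest case: two biallelic markers, so four haplotypes (00, 01, 10, 11), with the pool size fixed. In that case the integer vector (1, -1, -1, 1) leaves both allele counts and the pool size unchanged, so adding or subtracting it moves between latent haplotype counts that are consistent with the same observation. The names `MOVE`, `log_multinomial_kernel`, and `metropolis_step` are hypothetical, and the plain Metropolis update is an illustrative choice, not necessarily the sampler used in the paper.

```python
import math
import numpy as np

# Assumed setup: haplotypes ordered 00, 01, 10, 11. MOVE spans the integer
# kernel of [[1,1,1,1],[0,0,1,1],[0,1,0,1]] (pool size, marker 1, marker 2).
MOVE = np.array([1, -1, -1, 1])

def log_multinomial_kernel(z, freqs):
    """Log of prod_i p_i^z_i / z_i!, dropping the N! term (N is fixed)."""
    return sum(int(zi) * math.log(pi) - math.lgamma(int(zi) + 1)
               for zi, pi in zip(z, freqs))

def metropolis_step(z, freqs, rng):
    """One Metropolis update of latent counts z along +MOVE or -MOVE."""
    sign = 1 if rng.random() < 0.5 else -1
    proposal = z + sign * MOVE
    if (proposal < 0).any():
        return z  # proposal leaves the set of valid counts; stay put
    log_ratio = (log_multinomial_kernel(proposal, freqs)
                 - log_multinomial_kernel(z, freqs))
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return proposal
    return z

rng = np.random.default_rng(1)
freqs = np.array([0.4, 0.1, 0.2, 0.3])   # illustrative frequencies
z = np.array([40, 10, 20, 30])           # illustrative starting counts
for _ in range(200):
    z = metropolis_step(z, freqs, rng)
```

Because the proposal is symmetric and every accepted state keeps the same allele counts, the chain targets the distribution of latent counts given the observed pool. With more markers, a single move is no longer enough and a larger basis is needed.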
Abstract
In genetic studies, haplotype data provide more refined information than data about separate genetic markers. However, large-scale studies that genotype hundreds to thousands of individuals may only provide results of pooled data, where only the total allele counts of each marker in each pool are reported. Methods for inferring haplotype frequencies from pooled genetic data that scale well with pool size rely on a normal approximation, which we observe to produce unreliable inference when applied to real data. We illustrate cases where the approximation breaks down, due to the normal covariance matrix being near-singular. As an alternative to approximate methods, in this paper we propose exact methods to infer haplotype frequencies from pooled genetic data based on a latent multinomial model, where the observed allele counts are considered integer combinations of latent, unobserved…
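The data-generating process described in the abstract can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: markers are taken to be biallelic, every binary haplotype is allowed, and the function names `haplotype_matrix` and `simulate_pool` as well as the frequencies are hypothetical.

```python
import itertools
import numpy as np

def haplotype_matrix(num_markers):
    """Rows are markers, columns are all 2**num_markers binary haplotypes."""
    haps = list(itertools.product([0, 1], repeat=num_markers))
    return np.array(haps, dtype=int).T

def simulate_pool(freqs, pool_size, num_markers, rng):
    """Draw latent haplotype counts, then reduce them to allele counts."""
    A = haplotype_matrix(num_markers)
    z = rng.multinomial(pool_size, freqs)  # latent, unobserved counts
    y = A @ z                              # observed allele count per marker
    return z, y

rng = np.random.default_rng(0)
freqs = np.array([0.4, 0.1, 0.2, 0.3])    # illustrative frequencies for 00, 01, 10, 11
z, y = simulate_pool(freqs, pool_size=100, num_markers=2, rng=rng)
print("latent haplotype counts:", z)
print("observed allele counts:", y)
```

Only `y` would be reported for a pool. Several different vectors `z` map to the same `y`, which is why inference has to account for the latent counts.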
Taxonomy
Topics: Genetic Associations and Epidemiology · Gene expression and cancer classification · Genetic Mapping and Diversity in Plants and Animals
