Haplotype Inference for Pedigrees with Few Recombinations

Bonnie Kirkpatrick

arXiv:1602.04270 · cs.DS · February 16, 2016

TL;DR

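This paper presents an algorithm for haplotype inference in pedigrees that minimizes the number of recombinations. The problem is NP-hard in general, but the algorithm runs in $O(n^{k+2} m^{6k})$ time for $n$ individuals, $m$ sites, and $k$ recombinations, which is polynomial when $k$ is fixed.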

Contribution

It formulates haplotype inference as a graph optimization problem and introduces a tailored algorithm with practical running time for small numbers of recombinations.

Findings

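1. For any fixed $k$, the running time $O(n^{k+2} m^{6k})$ is polynomial in the number of individuals $n$ and the number of sites $m$.

2. The algorithm targets pedigrees that can have thousands of individuals and hundreds of thousands of sites.

3. The approach is practically relevant because there are generally only 1-2 recombinations per chromosome in each meiosis, so $k$ is small.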

Abstract

Pedigrees, or family trees, are graphs of family relationships that are used to study inheritance. A fundamental problem in computational biology is to find, for a pedigree with $n$ individuals genotyped at every site, a set of Mendelian-consistent haplotypes that have the minimum number of recombinations. This is an NP-hard problem, and some pedigrees can have thousands of individuals and hundreds of thousands of sites. This paper formulates the problem as an optimization on a graph and introduces a tailored algorithm with a running time of $O(n^{k+2} m^{6k})$ for $n$ individuals, $m$ sites, and $k$ recombinations. Since there are generally only 1-2 recombinations per chromosome in each meiosis, $k$ is small enough to make this algorithm practically relevant.
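To get a feel for how the stated bound depends on $k$, the following minimal sketch evaluates the base-10 logarithm of $n^{k+2} m^{6k}$ for a few values of $k$. It is only an illustration of the formula in the abstract: the function name `log10_bound` and the sample sizes are assumptions chosen here, constant factors hidden by the $O(\cdot)$ notation are ignored, and the numbers are upper-bound magnitudes, not measured running times.

```python
import math


def log10_bound(n: int, m: int, k: int) -> float:
    """Return log10 of n^(k+2) * m^(6k).

    Hypothetical helper for illustration only; it ignores the constant
    factors hidden by the O() notation in the abstract.
    """
    return (k + 2) * math.log10(n) + 6 * k * math.log10(m)


if __name__ == "__main__":
    # Sample sizes are assumptions, picked to match the scale the
    # abstract mentions (thousands of individuals, many sites).
    n, m = 1000, 100_000
    for k in range(4):
        print(f"k={k}: bound is about 10^{log10_bound(n, m, k):.0f}")
```

Each additional recombination multiplies the bound by $n \cdot m^{6}$, so the exponent of the polynomial grows linearly in $k$ while the bound stays polynomial in $n$ and $m$ for any fixed $k$.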

Taxonomy

Topics: Genetic Associations and Epidemiology · Genomic variations and chromosomal abnormalities · Genetic Mapping and Diversity in Plants and Animals