Haplotype Inference on Pedigrees with Recombinations, Errors, and Missing Genotypes via SAT solvers
Yuri Pirola, Gianluca Della Vedova, Stefano Biffani, Alessandra Stella, and Paola Bonizzoni

TL;DR
This paper introduces HCRE, a generalized haplotype inference model that accounts for errors and missing data, and presents a SAT solver-based exact algorithm demonstrating practical effectiveness on simulated and real datasets.
Contribution
It extends the MRHC model to include errors and missing genotypes and develops a SAT-based exact algorithm for this more realistic and complex problem.
Findings
The SAT-based algorithm effectively solves HCRE instances.
HCRE provides more biologically accurate haplotype inference.
Experimental results show high accuracy and performance on real and simulated data.
Abstract
The Minimum-Recombinant Haplotype Configuration problem (MRHC) has been highly successful in providing a sound combinatorial formulation for the important problem of genotype phasing on pedigrees. Despite several algorithmic advances and refinements that led to some efficient algorithms, its applicability to real datasets has been limited by the absence of some important characteristics of these data in its formulation, such as mutations, genotyping errors, and missing data. In this work, we propose the Haplotype Configuration with Recombinations and Errors problem (HCRE), which generalizes the original MRHC formulation by incorporating the two most common characteristics of real data: errors and missing genotypes (including untyped individuals). Although HCRE is computationally hard, we propose an exact algorithm for the problem based on a reduction to the well-known Satisfiability…
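To make the idea of a reduction to Satisfiability concrete, the following is a minimal sketch for a single biallelic locus in a father-mother-child trio. It is not the encoding used in the paper, which is not reproduced in this summary. The variable numbering, the helper names (`genotype_clauses`, `inheritance_clauses`, `min_errors`, `trio_min_errors`), and the genotype strings `"0/0"`, `"0/1"`, `"1/1"` are all assumptions made for illustration. Each individual gets two allele variables and one error flag; an observed genotype constrains the alleles only when the error flag is false, and a missing genotype (`None`) adds no constraint. A brute-force search stands in for a real SAT solver, and recombinations (which only arise across several loci) are left out.

```python
from itertools import product

def genotype_clauses(p, m, e, genotype):
    """CNF clauses tying alleles p, m to an observed genotype unless error flag e is true.

    Literals are DIMACS-style signed integers (hypothetical numbering chosen by the caller).
    """
    if genotype is None:          # missing genotype: alleles are unconstrained
        return []
    if genotype == "0/0":
        return [[e, -p], [e, -m]]
    if genotype == "1/1":
        return [[e, p], [e, m]]
    if genotype == "0/1":         # heterozygous: exactly one allele is 1
        return [[e, p, m], [e, -p, -m]]
    raise ValueError(f"unknown genotype: {genotype!r}")

def inheritance_clauses(child, par_p, par_m, s):
    """Child allele equals par_p when s is false and par_m when s is true."""
    return [
        [s, -child, par_p], [s, child, -par_p],
        [-s, -child, par_m], [-s, child, -par_m],
    ]

def min_errors(clauses, n_vars, error_vars):
    """Brute-force stand-in for a SAT solver: fewest true error flags over all models."""
    best = None
    for bits in product([False, True], repeat=n_vars):
        def holds(lit):
            value = bits[abs(lit) - 1]
            return value if lit > 0 else not value
        if all(any(holds(lit) for lit in clause) for clause in clauses):
            cost = sum(bits[v - 1] for v in error_vars)
            if best is None or cost < best:
                best = cost
    return best

def trio_min_errors(father, mother, child):
    """Minimum number of genotyping errors needed to explain a trio at one locus."""
    fp, fm, fe = 1, 2, 3      # father: paternal allele, maternal allele, error flag
    mp, mm, me = 4, 5, 6      # mother
    cp, cm, ce = 7, 8, 9      # child
    sf, sm = 10, 11           # which parental allele each child allele copies
    clauses = []
    clauses += genotype_clauses(fp, fm, fe, father)
    clauses += genotype_clauses(mp, mm, me, mother)
    clauses += genotype_clauses(cp, cm, ce, child)
    clauses += inheritance_clauses(cp, fp, fm, sf)
    clauses += inheritance_clauses(cm, mp, mm, sm)
    return min_errors(clauses, 11, [fe, me, ce])
```

For example, `trio_min_errors("0/0", "0/0", "1/1")` reports that at least one genotype must be flagged as erroneous, whereas a trio with an untyped father such as `trio_min_errors(None, "0/0", "0/1")` is explainable with no errors. In practice the clause lists would be handed to an off-the-shelf SAT solver rather than enumerated exhaustively.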
Taxonomy
Topics: Genetic Associations and Epidemiology · Genetic and phenotypic traits in livestock · Cholesterol and Lipid Metabolism
