Pure Parsimony Xor Haplotyping
Paola Bonizzoni, Gianluca Della Vedova, Riccardo Dondi, Yuri Pirola, Romeo Rizzi

TL;DR
This paper formulates haplotype inference from xor-genotype data as a pure parsimony problem, providing exact algorithms, a polynomial-time approximation, and a scalable heuristic validated on real-world data.
Contribution
It formulates xor-genotype haplotyping as a pure parsimony problem and offers polynomial-time algorithms for restricted cases, a fixed-parameter algorithm for the general case, a polynomial-time k-approximation, and a scalable heuristic.
Findings
Exact polynomial-time algorithms for restricted cases
A polynomial-time k-approximation algorithm, where k is the maximum number of xor-genotypes containing a given SNP
A heuristic that scales to large real-world datasets
Abstract
Haplotype resolution from xor-genotype data has recently been formulated as a new model for genetic studies. Xor-genotype data is a cheaply obtainable type of data that distinguishes heterozygous from homozygous sites without identifying the homozygous alleles. In this paper we propose a formulation based on a well-known model used in haplotype inference: pure parsimony. We exhibit exact solutions of the problem by providing polynomial-time algorithms for some restricted cases and a fixed-parameter algorithm for the general case. These results are based on some interesting combinatorial properties of a graph representation of the solutions. Furthermore, we show that the problem has a polynomial-time k-approximation, where k is the maximum number of xor-genotypes containing a given SNP. Finally, we propose a heuristic and produce an experimental analysis showing that it scales to…
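To make the data model in the abstract concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes haplotypes are encoded as equal-length tuples of 0/1 alleles, so that a xor-genotype is the position-wise xor of two haplotypes (1 marks a heterozygous site, 0 a homozygous one). The function names `xor_genotype`, `resolves`, and `max_snp_occurrences` are hypothetical, and allowing a haplotype to be paired with itself is an assumption of this sketch.

```python
from itertools import combinations_with_replacement


def xor_genotype(h1, h2):
    """Position-wise xor of two haplotypes: 1 = heterozygous site, 0 = homozygous site."""
    if len(h1) != len(h2):
        raise ValueError("haplotypes must cover the same SNPs")
    return tuple(a ^ b for a, b in zip(h1, h2))


def resolves(haplotypes, genotypes):
    """Check that every xor-genotype is the xor of some pair of the given haplotypes.

    Assumption of this sketch: a haplotype may be paired with itself,
    which explains an all-homozygous (all-zero) xor-genotype.
    """
    explained = {
        xor_genotype(a, b)
        for a, b in combinations_with_replacement(haplotypes, 2)
    }
    return all(g in explained for g in genotypes)


def max_snp_occurrences(genotypes):
    """The parameter k: the largest number of xor-genotypes that contain any one SNP."""
    return max(sum(column) for column in zip(*genotypes))


# Three xor-genotypes over three SNPs, and a candidate set of three haplotypes.
genotypes = [(1, 1, 0), (0, 1, 1), (1, 0, 1)]
haplotypes = [(0, 0, 0), (1, 1, 0), (0, 1, 1)]

print(resolves(haplotypes, genotypes))      # all three genotypes are explained
print(resolves(haplotypes[:2], genotypes))  # two haplotypes are not enough here
print(max_snp_occurrences(genotypes))       # each SNP appears in two genotypes
```

Under pure parsimony the goal is a smallest haplotype set for which `resolves` holds; the sketch only checks a candidate set and does not search for a minimum one.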
