Matrix Completion and Performance Guarantees for Single Individual Haplotyping
Somsubhra Barik, Haris Vikalo

TL;DR
This paper introduces a binary matrix factorization approach, with theoretical guarantees, to the NP-hard problem of single individual haplotyping, and reports improved accuracy over existing methods on real and synthetic data.
Contribution
It formulates single individual haplotyping as a binary matrix factorization problem, solves it by alternating minimization, and analyzes the algorithm's convergence together with bounds on the haplotype reconstruction error.
Findings
Outperforms existing haplotyping methods on real datasets
Provides theoretical bounds for haplotype reconstruction error
Analyzes convergence properties of the proposed algorithm
Abstract
Single individual haplotyping is an NP-hard problem that emerges when attempting to reconstruct an organism's inherited genetic variations using data typically generated by high-throughput DNA sequencing platforms. Genomes of diploid organisms, including humans, are organized into homologous pairs of chromosomes that differ from each other in a relatively small number of variant positions. Haplotypes are ordered sequences of the nucleotides in the variant positions of the chromosomes in a homologous pair; for diploids, haplotypes associated with a pair of chromosomes may be conveniently represented by means of complementary binary sequences. In this paper, we consider a binary matrix factorization formulation of the single individual haplotyping problem and efficiently solve it by means of alternating minimization. We analyze the convergence properties of the alternating minimization…
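The abstract is cut off before it gives the update rules, so the following is only a minimal sketch of what alternating minimization over binary factors can look like, not the authors' algorithm. It assumes a read matrix `R` (reads by variant positions) with entries in {+1, -1} and 0 for positions a read does not cover, and a rank-one model `R ≈ u hᵀ` on the observed entries, where `h` encodes one haplotype (its complement is `-h`) and `u` assigns each read to a haplotype. The names `sign_pm1`, `altmin_haplotype`, and `mismatches`, the SVD-based initialization, and the tie-breaking rule are all illustrative assumptions.

```python
import numpy as np

def sign_pm1(x):
    """Map a real vector to {-1, +1}; ties (zeros) go to +1 (an arbitrary choice)."""
    return np.where(x >= 0, 1, -1)

def altmin_haplotype(R, n_iters=50):
    """Alternating minimization sketch for R ~ outer(u, h) on observed entries.

    R: (reads x variant positions) array with entries in {-1, +1} and 0 where
       a read does not cover a position.
    Returns (u, h), both with entries in {-1, +1}.
    """
    R = np.asarray(R, dtype=float)
    # Assumed initialization: sign of the leading right singular vector
    # of the zero-filled read matrix.
    _, _, Vt = np.linalg.svd(R, full_matrices=False)
    h = sign_pm1(Vt[0])
    for _ in range(n_iters):
        # For binary factors, the squared error on observed entries is
        # minimized by maximizing u^T R h, so each half-step is a sign rule.
        u = sign_pm1(R @ h)        # fix h, update read memberships
        h_new = sign_pm1(R.T @ u)  # fix u, update the haplotype
        if np.array_equal(h_new, h):
            break                  # fixed point reached
        h = h_new
    u = sign_pm1(R @ h)            # keep u consistent with the returned h
    return u, h

def mismatches(R, u, h):
    """Count observed entries of R that disagree with outer(u, h)."""
    R = np.asarray(R)
    return int(np.sum((R != 0) & (R != np.outer(u, h))))

# Small noise-free example with two uncovered entries (zeros).
h_demo = np.array([1, -1, -1, 1, 1])
u_demo = np.array([1, 1, -1, -1])
R_demo = np.outer(u_demo, h_demo)
R_demo[0, 4] = 0
R_demo[3, 0] = 0
u_hat, h_hat = altmin_haplotype(R_demo)
print(h_hat, mismatches(R_demo, u_hat, h_hat))
```

Because `h` and `-h` describe the same pair of complementary haplotypes, a recovered haplotype should be compared with the truth only up to a global sign flip. Like alternating schemes in general, this sketch can stop at a local fixed point when reads are noisy or coverage is sparse.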
