Minimum Segmentation for Pan-genomic Founder Reconstruction in Linear Time
Tuukka Norri, Bastien Cazaux, Dmitry Kosolobov, Veli Mäkinen

TL;DR
This paper presents an optimal linear-time algorithm for the minimum segmentation problem in pan-genomic founder reconstruction, significantly improving efficiency for large-scale haplotype datasets.
Contribution
The authors develop an $O(mn)$ time algorithm for founder sequence segmentation, where $m$ is the number of haplotypes and $n$ is the length of each, improving on previous $O(mn^2)$ solutions and enabling practical analysis of complete human chromosomes.
Findings
Achieved linear-time complexity for the segmentation problem.
Demonstrated applicability to large-scale pan-genomic data.
Facilitated efficient reference set construction for genomics applications.
Abstract
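Given a threshold $L$ and a set $\mathcal{R} = \{R_1, \ldots, R_m\}$ of $m$ haplotype sequences, each having length $n$, the minimum segmentation problem for founder reconstruction is to partition the sequences into disjoint segments $\mathcal{R}[i_1 + 1, i_2], \mathcal{R}[i_2 + 1, i_3], \ldots, \mathcal{R}[i_{r-1} + 1, i_r]$, where $0 = i_1 < \cdots < i_r = n$ and $\mathcal{R}[i_{j-1} + 1, i_j]$ is the set $\{R_1[i_{j-1} + 1, i_j], \ldots, R_m[i_{j-1} + 1, i_j]\}$, such that the length of each segment, $i_j - i_{j-1}$, is at least $L$ and $K = \max_j \{|\mathcal{R}[i_{j-1} + 1, i_j]|\}$ is minimized. The distinct substrings in the segments $\mathcal{R}[i_{j-1} + 1, i_j]$ represent founder blocks that can be concatenated to form $K$ founder sequences representing the original $\mathcal{R}$ such that crossovers happen only at segment boundaries. We give an optimal $O(mn)$ time algorithm…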
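To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the paper's linear-time algorithm: it is a plain dynamic program over all candidate boundaries, far slower than $O(mn)$, and is meant only to illustrate the objective. The function name `min_segmentation` and the parameter name `min_len` (the length threshold) are hypothetical choices made for this illustration.

```python
def min_segmentation(haplotypes, min_len):
    """Brute-force illustration of the minimum segmentation objective.

    haplotypes: list of equal-length strings (the input sequences).
    min_len:    minimum allowed segment length (the threshold).

    Returns (k, cuts), where k is the smallest achievable maximum number
    of distinct substrings in any segment, and cuts lists the segment
    boundaries 0 = cuts[0] < ... < cuts[-1] = n.
    NOTE: illustrative only; this is not the paper's O(mn) method.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    n = len(haplotypes[0])
    if any(len(h) != n for h in haplotypes):
        raise ValueError("all haplotypes must have the same length")

    INF = float("inf")
    best = [INF] * (n + 1)   # best[j]: optimal cost for the prefix of length j
    prev = [-1] * (n + 1)    # prev[j]: start of the last segment in that optimum
    best[0] = 0

    for j in range(min_len, n + 1):
        for i in range(0, j - min_len + 1):
            if best[i] == INF:
                continue  # prefix of length i cannot be segmented validly
            # Number of distinct substrings in columns i+1..j (1-based).
            distinct = len({h[i:j] for h in haplotypes})
            cost = max(best[i], distinct)
            if cost < best[j]:
                best[j] = cost
                prev[j] = i

    if best[n] == INF:
        raise ValueError("no segmentation satisfies the length threshold")

    # Backtrack to recover the segment boundaries.
    cuts = [n]
    while cuts[-1] != 0:
        cuts.append(prev[cuts[-1]])
    cuts.reverse()
    return best[n], cuts


# Three haplotypes of length 4, threshold 2: cutting after column 2 leaves
# two distinct substrings in each segment, versus three with no cut.
print(min_segmentation(["AAAA", "AABB", "BBBB"], 2))  # (2, [0, 2, 4])
```

In this toy input, the two distinct substrings per segment are the founder blocks, so two founder sequences suffice to represent all three haplotypes with a crossover allowed only at the boundary after column 2.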
