Orienting Ordered Scaffolds: Complexity and Algorithms
Sergey Aganezov, Pavel Avdeyev, Nikita Alexeev, Yongwu Rong, Max A. Alekseyev

TL;DR
This paper studies the problem of orienting ordered genomic scaffolds, formalizes it as an optimization problem, proves NP-hardness, and proposes algorithms for special and general cases to improve genome assembly accuracy.
Contribution
It formalizes the scaffold orientation problem, proves its NP-hardness, and introduces polynomial and fixed-parameter tractable algorithms for solving it.
Findings
The scaffold orientation problem is NP-hard.
A polynomial-time algorithm solves the special case in which the orientation of each scaffold is imposed relative to at most two other scaffolds.
A fixed-parameter tractable (FPT) algorithm handles the general case.
Abstract
Despite the recent progress in genome sequencing and assembly, many of the currently available assembled genomes come in a draft form. Such draft genomes consist of a large number of genomic fragments (scaffolds), whose order and/or orientation (i.e., strand) in the genome are unknown. There exist various scaffold assembly methods, which attempt to determine the order and orientation of scaffolds along the genome chromosomes. Some of these methods (e.g., based on FISH physical mapping, chromatin conformation capture, etc.) can infer the order of scaffolds, but not necessarily their orientation. This leads to a special case of the scaffold orientation problem (i.e., deducing the orientation of each scaffold) with a known order of the scaffolds. We address the problem of orientating ordered scaffolds as an optimization problem based on given weighted orientations of scaffolds and their…
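To make the optimization view concrete, here is a minimal brute-force sketch in Python. It is not the paper's algorithm or its exact formulation: the function name `best_orientation`, the weight layout (`single_w` for per-scaffold orientation weights, `pair_w` for relative-orientation weights between pairs), and the additive scoring are all assumptions chosen for illustration. It enumerates every strand assignment for a fixed scaffold order and keeps the highest-scoring one.

```python
from itertools import product


def best_orientation(single_w, pair_w):
    """Exhaustively search strand assignments for ordered scaffolds.

    Hypothetical input layout (an assumption, not taken from the paper):
      single_w[i]    = (weight if scaffold i is forward, weight if reversed)
      pair_w[(i, j)] = (weight if i and j share a strand, weight if they differ)
    Returns (best_score, signs), where signs[i] is +1 (forward) or -1 (reversed).
    """
    n = len(single_w)
    best_score, best_signs = float("-inf"), None
    # 2^n candidate assignments: feasible only for small n.
    for signs in product((+1, -1), repeat=n):
        # Evidence for each scaffold's own orientation.
        score = sum(w[0] if s == +1 else w[1] for s, w in zip(signs, single_w))
        # Evidence for the relative orientation of scaffold pairs.
        score += sum(
            w[0] if signs[i] == signs[j] else w[1]
            for (i, j), w in pair_w.items()
        )
        if score > best_score:
            best_score, best_signs = score, signs
    return best_score, best_signs


# Three ordered scaffolds with made-up weights.
single_w = [(3, 0), (0, 1), (1, 1)]
pair_w = {(0, 1): (2, 0), (1, 2): (0, 4)}
print(best_orientation(single_w, pair_w))  # (10, (1, 1, -1))
```

The search is exponential in the number of scaffolds, which fits the NP-hardness result: no polynomial-time method is expected for the general case, so exhaustive search like this is only useful as a reference on tiny inputs.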
