SLIQ: Simple Linear Inequalities for Efficient Contig Scaffolding
Rajat S. Roy, Kevin C. Chen, Anirvan M. Sengupta, Alexander Schliep

TL;DR
SLIQ introduces simple linear inequalities, derived from the geometry of contigs on a line, that improve the accuracy and efficiency of contig scaffolding in genome assembly. It outperforms the majority voting procedure used by many scaffolding algorithms, especially on complex genomes such as human.
Contribution
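It presents a novel set of linear inequalities that predict the relative positions and orientations of contigs from individual mate pair reads. The inequalities can also filter out unreliable mate pairs and serve as a preprocessing step for any scaffolding algorithm.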
Findings
SLIQ achieves high accuracy in contig position and orientation prediction.
SLIQ outperforms majority voting in complex genome scaffolding.
The proposed scaffolding algorithm is more efficient than existing methods.
Abstract
Scaffolding is an important subproblem in "de novo" genome assembly in which mate pair data are used to construct a linear sequence of contigs separated by gaps. Here we present SLIQ, a set of simple linear inequalities derived from the geometry of contigs on the line that can be used to predict the relative positions and orientations of contigs from individual mate pair reads and thus produce a contig digraph. The SLIQ inequalities can also filter out unreliable mate pairs and can be used as a preprocessing step for any scaffolding algorithm. We tested the SLIQ inequalities on five real data sets ranging in complexity from simple bacterial genomes to complex mammalian genomes and compared the results to the majority voting procedure used by many other scaffolding algorithms. SLIQ predicted the relative positions and orientations of the contigs with high accuracy in all cases and gave…
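The abstract names majority voting as the baseline that SLIQ is compared against. As a rough illustration of that baseline only (not the SLIQ inequalities, and not any specific scaffolder's implementation), the sketch below decides the relative orientation of each contig pair by counting mate pair votes. The tuple layout `(contig_a, strand_a, contig_b, strand_b)`, the function name, and the `min_support` threshold are hypothetical, and the sketch assumes a library in which the two mates are expected to map to opposite strands.

```python
from collections import defaultdict


def majority_vote_orientation(mate_pairs, min_support=2):
    """Pick a relative orientation per contig pair by majority vote.

    mate_pairs: iterable of (contig_a, strand_a, contig_b, strand_b),
    with strands given as "+" or "-". This layout is an assumption
    made for the example.
    """
    # votes[pair] = [votes for "same", votes for "opposite"]
    votes = defaultdict(lambda: [0, 0])
    for contig_a, strand_a, contig_b, strand_b in mate_pairs:
        if contig_a == contig_b:
            continue  # a pair inside one contig says nothing about scaffolding
        pair = tuple(sorted((contig_a, contig_b)))
        # Assumption: mates are expected on opposite strands, so opposite
        # observed strands imply the contigs share an orientation.
        if strand_a != strand_b:
            votes[pair][0] += 1
        else:
            votes[pair][1] += 1

    decisions = {}
    for pair, (same, opposite) in votes.items():
        if same + opposite < min_support or same == opposite:
            continue  # too little evidence, or a tie: leave undecided
        decisions[pair] = "same" if same > opposite else "opposite"
    return decisions


example_pairs = [
    ("c1", "+", "c2", "-"),
    ("c1", "+", "c2", "-"),
    ("c2", "+", "c1", "+"),  # conflicting vote, outvoted 2 to 1
    ("c3", "+", "c4", "+"),  # single vote, below min_support
]
example_decisions = majority_vote_orientation(example_pairs)
```

A scheme like this needs several mate pairs per contig pair before it can decide anything, whereas the abstract describes SLIQ as making predictions from individual mate pair reads.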
