An Efficient Algorithm For Chinese Postman Walk on Bi-directed de Bruijn Graphs
Vamsi Kundeti, Sanguthevar Rajasekaran, Hieu Dinh

TL;DR
This paper introduces a faster algorithm for the cyclic Chinese Postman problem on bi-directed de Bruijn graphs. It solves the problem directly rather than reducing it to bi-directed flow, which speeds up a core step of genome sequence assembly.
Contribution
It presents a novel algorithm that solves the cyclic CPP on weighted bi-directed de Bruijn graphs more efficiently than previous flow-based methods, especially when imbalance is low.
Findings
The new algorithm has a time complexity of Θ(p(|V| + |E|) log(|V|) + (d_max p)^3).
In the reported experiments, p/|V| falls between 0.08% and 0.13% across the datasets.
The algorithm outperforms bidirected flow algorithms when p is much less than |V|.
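To make the comparison concrete, the sketch below evaluates the growth terms of the two bounds quoted on this page: O(|E|^2 log^2(|V|)) for the flow-based approach and Θ(p(|V| + |E|) log(|V|) + (d_max p)^3) for the new algorithm. It is only an illustration. Constant factors are ignored, the logarithm base is an arbitrary choice, and the graph sizes and the value of d_max are hypothetical numbers, with p set to 0.1% of |V| to sit inside the reported range.

```python
import math


def flow_cost(num_vertices, num_edges):
    """Growth term of the flow-based bound: |E|^2 * log^2(|V|)."""
    return num_edges ** 2 * math.log2(num_vertices) ** 2


def direct_cost(num_vertices, num_edges, p, d_max):
    """Growth term of the new bound: p(|V| + |E|) log(|V|) + (d_max * p)^3."""
    linear_part = p * (num_vertices + num_edges) * math.log2(num_vertices)
    cubic_part = (d_max * p) ** 3
    return linear_part + cubic_part


# Hypothetical graph sizes, chosen only for illustration.
V = 1_000_000
E = 4_000_000
p = V // 1000   # p/|V| = 0.1%, inside the range reported above
d_max = 2       # assumed small maximum imbalance

ratio = flow_cost(V, E) / direct_cost(V, E, p, d_max)
print(f"flow term / direct term with p = 0.1% of |V|: {ratio:.0f}")

# With p as large as |V|, the cubic term dominates and the advantage disappears.
worst = flow_cost(V, E) / direct_cost(V, E, V, d_max)
print(f"flow term / direct term with p = |V|: {worst:.6f}")
```

Under these assumptions the direct bound is far smaller when p is a tiny fraction of |V|, and larger than the flow bound when p approaches |V|, because the (d_max p)^3 term then dominates.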
Abstract
Sequence assembly from short reads is an important problem in biology. It is known that solving the sequence assembly problem exactly on a bi-directed de Bruijn graph or a string graph is intractable. However, finding a Shortest Double stranded DNA string (SDDNA) containing all the k-long words in the reads seems to be a good heuristic to get close to the original genome. This problem is equivalent to finding a cyclic Chinese Postman (CP) walk on the underlying un-weighted bi-directed de Bruijn graph built from the reads. The Chinese Postman walk Problem (CPP) is solved by reducing it to a general bi-directed flow on this graph, which runs in O(|E|^2 log^2(|V|)) time. In this paper we show that the cyclic CPP on bi-directed graphs can be solved without reducing it to bi-directed flow. We present a Θ(p(|V| + |E|) log(|V|) + (d_max p)^3) time algorithm to solve the cyclic CPP on a weighted…
