Detecting Superbubbles in Assembly Graphs
Taku Onodera, Kunihiko Sadakane, and Tetsuo Shibuya

TL;DR
This paper introduces the superbubble, a subgraph class that extends the bubble motif in assembly graphs, together with an efficient algorithm for detecting it, so that errors too complex for simple motifs such as tips and bubbles can be analyzed.
Contribution
It presents the novel concept of superbubbles and an average-case linear time algorithm for their detection in assembly graphs.
Findings
The algorithm runs in O(n+m) time on average for graphs with n vertices and m edges that follow a reasonable model
Superbubbles can be enumerated efficiently even though they are much more complex than ordinary bubbles
Applicable to large-scale genome assembly graphs
Abstract
We introduce a new concept of a subgraph class called a superbubble for analyzing assembly graphs, and propose an efficient algorithm for detecting it. Most assembly algorithms utilize assembly graphs like the de Bruijn graph or the overlap graph constructed from reads. From these graphs, many assembly algorithms first detect simple local graph structures (motifs), such as tips and bubbles, mainly to find sequencing errors. These motifs are easy to detect, but they are sometimes too simple to deal with more complex errors. The superbubble is an extension of the bubble, which is also important for analyzing assembly graphs. Though superbubbles are much more complex than ordinary bubbles, we show that they can be efficiently enumerated. We propose an average-case linear time algorithm (i.e., O(n+m) for a graph with n vertices and m edges) for graphs with a reasonable model, though the…
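To make the idea of detecting a superbubble concrete, here is a minimal Python sketch of a per-entrance search. The truncated abstract does not define a superbubble, so the sketch assumes the usual reading of the term: a region with a single entrance s and a single exit t, acyclic inside, in which everything reachable from s before t also leads to t. The function names `build_graph` and `find_superbubble_exit` are hypothetical, the graph is assumed to have no parallel edges, and nothing here should be read as the authors' exact algorithm or as evidence for the average-case bound.

```python
from collections import defaultdict


def build_graph(edges):
    """Build child and parent adjacency lists from (u, v) directed edges.

    Assumes a simple directed graph (no parallel edges).
    """
    children = defaultdict(list)
    parents = defaultdict(list)
    for u, v in edges:
        children[u].append(v)
        parents[v].append(u)
    return children, parents


def find_superbubble_exit(s, children, parents):
    """Return the exit t of a superbubble entered at s, or None.

    Illustrative sketch under the assumed definition described above.
    A vertex is expanded only after all of its parents were expanded,
    so the search moves forward like a topological traversal from s.
    """
    visited = set()   # vertices already expanded
    seen = set()      # vertices reached but not yet expanded
    stack = [s]
    while stack:
        v = stack.pop()
        visited.add(v)
        seen.discard(v)
        if not children.get(v):
            return None               # dead end (tip) inside the region
        for u in children[v]:
            if u == s:
                return None           # cycle back to the entrance
            seen.add(u)
            if all(p in visited for p in parents[u]):
                stack.append(u)
        # All paths have converged on one pending vertex: candidate exit.
        if len(stack) == 1 and len(seen) == 1 and stack[0] in seen:
            t = stack[0]
            if s in children.get(t, []):
                return None           # exit points back to the entrance
            return t
    return None                       # an interior cycle blocked the search


# Usage: a region more complex than a simple bubble (1 -> {2, 3} -> 4, plus 2 -> 3).
children, parents = build_graph([(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)])
exit_vertex = find_superbubble_exit(1, children, parents)
```

Calling the function once per candidate entrance enumerates the regions it recognizes; each call only touches vertices and edges inside the region it explores before succeeding or aborting.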
Taxonomy
Topics: Genomics and Chromatin Dynamics · Genomics and Phylogenetic Studies · RNA and protein synthesis mechanisms
