Popping Bubbles in Pangenome Graphs
Njagi Mwaniki, Erik Garrison, Nadia Pisanti

TL;DR
This paper introduces flubbles, a new concept for identifying variants in pangenome graphs, along with efficient algorithms and a tool that outperform existing methods in detecting complex genomic structures.
Contribution
The paper presents a novel definition of bubbles called flubbles, along with linear-time algorithms and a hierarchical flubble tree for analyzing genomic variants.
Findings
The Povu tool finds flubbles faster than existing tools such as vg and BubbleGun.
Povu can detect hairpin inversions, a structure no other tool currently identifies.
The methods are validated on human and yeast genomic data.
Abstract
In this paper, we introduce flubbles, a new definition of "bubbles" corresponding to variants in a (pan)genome graph G. We then give a characterization of flubbles in terms of equivalence classes regarding cycles in an intermediate data structure built from the spanning tree of G, which leads to a linear time and space solution for finding all flubbles. Furthermore, we show how a related characterization also allows us to efficiently detect what we define as hairpin inversions: a cycle preceded and followed by the same path in the graph; since that path is necessarily traversed in both directions, this structure corresponds to an inversion. Finally, inspired by the concept of Program Structure Tree introduced fifty years ago to represent the hierarchy of the control structure of a program, we define a tree representing the structure of G in terms of flubbles, the flubble tree, which…
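To make the hairpin inversion definition concrete, here is a minimal sketch in Python. It is not the paper's algorithm and not part of Povu: it only checks whether a single walk, given as a list of oriented node visits, has the shape "stem, loop, same stem traversed backwards". The walk representation, the `+`/`-` orientation encoding, and the function names are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical encoding: a step is (node id, orientation), orientation "+" or "-".
Step = Tuple[str, str]


def flip(step: Step) -> Step:
    """Return the same node visited in the opposite orientation."""
    node, orient = step
    return (node, "-" if orient == "+" else "+")


def hairpin_stem_length(walk: List[Step]) -> int:
    """Length of the longest stem such that the walk starts with the stem,
    ends with the stem reversed and flipped, and has a non-empty loop between.

    Returns 0 when the walk has no hairpin shape.
    """
    n = len(walk)
    k = 0
    # Keep at least one step in the middle for the loop: n - 2*(k+1) >= 1.
    while n - 2 * (k + 1) >= 1 and walk[n - 1 - k] == flip(walk[k]):
        k += 1
    return k


# Stem 1+ 2+, loop 3+ 4+, then the stem traversed backwards as 2- 1-.
example = [("1", "+"), ("2", "+"), ("3", "+"), ("4", "+"), ("2", "-"), ("1", "-")]
stem = hairpin_stem_length(example)
```

For the example walk the stem is the path `1+ 2+`, which is traversed once forwards and once backwards around the loop `3+ 4+`. Finding all such structures in a graph in linear time, as the abstract claims, requires the characterization developed in the paper rather than this per-walk check.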
Taxonomy
Topics: Complex Network Analysis Techniques · Peer-to-Peer Network Technologies
