Ultrabubble enumeration via a lowest common ancestor approach
Athanasios E. Zisis, Pål Sætrom

TL;DR
This paper introduces an efficient method for enumerating ultrabubbles in variation graphs using lowest common ancestor (LCA) queries. It significantly improves runtime over previous approaches, especially in graphs with few cycles.
Contribution
The paper presents a novel transformation of bidirected graphs into bipartite biedged graphs that enables LCA queries for ultrabubble detection, together with an O(Kn) algorithm for enumerating all ultrabubbles.
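To make the notion of an LCA query concrete, here is a minimal Python sketch of lowest common ancestor lookup on a rooted tree using binary lifting. It is a generic textbook technique and is not claimed to be the authors' algorithm; the function name `build_lca`, the parent-array input, and the toy tree are assumptions chosen for illustration.

```python
from typing import Callable, List


def build_lca(parent: List[int], root: int) -> Callable[[int, int], int]:
    """Preprocess a rooted tree given as a parent array (hypothetical input
    format) and return a function answering LCA queries by binary lifting."""
    n = len(parent)

    # Build child lists so depths can be computed top-down.
    children: List[List[int]] = [[] for _ in range(n)]
    for v, p in enumerate(parent):
        if v != root:
            children[p].append(v)

    # Breadth-first traversal from the root to assign depths.
    depth = [0] * n
    order = [root]
    for u in order:
        for c in children[u]:
            depth[c] = depth[u] + 1
            order.append(c)

    # up[k][v] is the 2^k-th ancestor of v (the root is its own ancestor).
    log = max(1, (n - 1).bit_length())
    up = [[root] * n for _ in range(log)]
    up[0] = [root if v == root else parent[v] for v in range(n)]
    for k in range(1, log):
        for v in range(n):
            up[k][v] = up[k - 1][up[k - 1][v]]

    def lca(u: int, v: int) -> int:
        # Lift the deeper node until both nodes are at the same depth.
        if depth[u] < depth[v]:
            u, v = v, u
        diff = depth[u] - depth[v]
        k = 0
        while diff:
            if diff & 1:
                u = up[k][u]
            diff >>= 1
            k += 1
        if u == v:
            return u
        # Lift both nodes together while their ancestors still differ.
        for k in reversed(range(log)):
            if up[k][u] != up[k][v]:
                u, v = up[k][u], up[k][v]
        return up[0][u]

    return lca


# Toy tree: 0 is the root; 1 and 2 are its children; 3 and 4 hang under 1;
# 5 hangs under 2; 6 hangs under 3.
lca = build_lca([0, 0, 0, 1, 1, 2, 3], root=0)
print(lca(6, 4), lca(6, 5), lca(3, 6))
```

Binary lifting takes O(n log n) preprocessing and O(log n) per query. It is used here only because it is short to write down; the summary does not say which LCA data structure the paper relies on.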
Findings
Improved runtime in graphs with few cycles and dead ends.
Effective in dense graphs with many edges.
Benchmark results show faster ultrabubble enumeration.
Abstract
Pangenomics uses graph-based models to represent and study the genetic variation between individuals of the same species or between different species. In such variation graphs, a path through the graph represents one individual genome. Subgraphs that encode locally distinct paths are therefore genomic regions with distinct genetic variation, and detecting such subgraphs is integral to studying genetic variation. Biedged graphs are a type of variation graph that uses two types of edges, black and grey, to represent genomic sequences and adjacencies between sequences, respectively. Ultrabubbles in biedged graphs are minimal subgraphs that represent a finite set of sequence variants that all start and end with two distinct sequences; that is, ultrabubbles are acyclic and all paths in an ultrabubble enter and exit through two distinct black edges. Ultrabubbles are therefore a special case of…
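As a rough illustration of the biedged representation described in the abstract, the Python sketch below stores each sequence as a black edge between its two ends and each adjacency as a grey edge between sequence ends. The function name `to_biedged`, the `(name, side)` endpoint encoding, and the toy input are assumptions made for illustration; they are not taken from the paper, and the paper's bipartite transformation is not reproduced here.

```python
from typing import Iterable, List, Set, Tuple

Endpoint = Tuple[str, str]          # (sequence name, "L" or "R"), hypothetical encoding
Edge = Tuple[Endpoint, Endpoint]


def to_biedged(
    sequences: Iterable[str],
    adjacencies: Iterable[Tuple[Endpoint, Endpoint]],
) -> Tuple[List[Edge], Set[Edge]]:
    """Return (black_edges, grey_edges) for a toy biedged graph."""
    # One black edge per sequence, joining its left and right ends.
    black = [((s, "L"), (s, "R")) for s in sequences]

    # Grey edges are undirected adjacencies; store each pair in sorted
    # order so that (u, v) and (v, u) collapse into a single edge.
    grey: Set[Edge] = set()
    for u, v in adjacencies:
        grey.add((u, v) if u <= v else (v, u))
    return black, grey


# Toy input: sequence "a" is followed by either "b" or "c", and both
# alternatives are followed by "d".
black, grey = to_biedged(
    ["a", "b", "c", "d"],
    [
        (("a", "R"), ("b", "L")),
        (("a", "R"), ("c", "L")),
        (("b", "R"), ("d", "L")),
        (("c", "R"), ("d", "L")),
    ],
)
print(len(black), "black edges;", len(grey), "grey edges")
```

In this toy graph every path from sequence "a" to sequence "d" enters through the black edge of "a" and leaves through the black edge of "d", which matches the entry and exit condition the abstract gives for ultrabubbles.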
Taxonomy
Topics: Genome Rearrangement Algorithms · Genomics and Phylogenetic Studies · Genetic Associations and Epidemiology
