Variable-Order de Bruijn Graphs
Christina Boucher, Alex Bowe, Travis Gagie, Simon J. Puglisi, Kunihiko Sadakane

TL;DR
This paper introduces a space-efficient data structure that lets the order of a de Bruijn graph be changed on the fly, so that a single structure represents the graphs of multiple orders for genome assembly.
Contribution
It extends a succinct de Bruijn graph representation to support variable order operations, reducing memory and computational overhead compared to building multiple fixed-order graphs.
Findings
Supports changing graph order on the fly
Modest increase in space and time complexity
Efficiently represents all graphs up to a maximum order
Abstract
The de Bruijn graph $G_K$ of a set of strings $S$ is a key data structure in genome assembly that represents overlaps between all the $K$-length substrings of $S$. Construction and navigation of the graph is a space and time bottleneck in practice and the main hurdle for assembling large, eukaryote genomes. This problem is compounded by the fact that state-of-the-art assemblers do not build the de Bruijn graph for a single order (value of $K$) but for multiple values of $K$. More precisely, they build $d$ de Bruijn graphs, each with a specific order, i.e., $G_{K_1}, G_{K_2}, \ldots, G_{K_d}$. Although this paradigm increases the quality of the assembly produced, it increases the memory by a factor of $d$ in most cases. In this paper, we show how to augment a succinct de Bruijn graph representation by Bowe et al. (Proc. WABI, 2012) to support new operations that let us change order on the…
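The multi-order baseline that the abstract describes can be illustrated with a minimal sketch. This is not the succinct representation of Bowe et al. nor the variable-order structure from the paper; it is a naive dictionary-based construction, included only to show why building one graph per order multiplies memory. It assumes one common convention (nodes are the distinct k-mers, and an edge joins two k-mers that overlap in a (k+1)-mer of the input). The function names `de_bruijn_graph`, `build_all_orders`, and `total_nodes` are hypothetical.

```python
def de_bruijn_graph(reads, k):
    """Naive order-k de Bruijn graph as an adjacency dict.

    Assumed convention: nodes are the distinct k-mers of the reads;
    there is an edge u -> v when some (k+1)-mer of a read has
    u as its prefix and v as its suffix.
    """
    graph = {}
    for read in reads:
        # Register every k-mer as a node, even if it has no outgoing edge.
        for i in range(len(read) - k + 1):
            graph.setdefault(read[i:i + k], set())
        # Each (k+1)-mer contributes one edge between its two k-mers.
        for i in range(len(read) - k):
            kplus1 = read[i:i + k + 1]
            graph[kplus1[:-1]].add(kplus1[1:])
    return graph


def build_all_orders(reads, orders):
    """Baseline approach: one independent graph per order."""
    return {k: de_bruijn_graph(reads, k) for k in orders}


def total_nodes(graphs):
    """Total node count across all stored graphs (a rough proxy for memory)."""
    return sum(len(g) for g in graphs.values())


if __name__ == "__main__":
    reads = ["ACGTACG"]
    graphs = build_all_orders(reads, [2, 3, 4])
    for k, g in graphs.items():
        print(k, {node: sorted(succ) for node, succ in sorted(g.items())})
    print("nodes stored across all orders:", total_nodes(graphs))
```

Each order here gets its own full copy of the graph, so storage grows roughly with the number of orders. That is the overhead a single variable-order structure is meant to avoid.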
Taxonomy
Topics: Genomics and Phylogenetic Studies · Algorithms and Data Compression · Chromosomal and Genetic Variations
