Comparative genomics with succinct colored de Bruijn graphs
Lucas P. Ramos, Felipe A. Louza, Guilherme P. Telles

TL;DR
This paper introduces gcBB and mgcBB, novel methods for direct genome comparison using succinct colored de Bruijn graphs, avoiding sequence alignments and leveraging entropy measures for improved phylogenetic analysis.
Contribution
The paper presents gcBB and mgcBB, two tools that compare genomes directly on succinct colored de Bruijn graphs, with the stated aim of improving efficiency and accuracy over existing methods.
Findings
gcBB effectively compares draft genomes without alignments.
mgcBB significantly improves computational performance.
Phylogenies derived from gcBB show promising accuracy.
Abstract
DNA technologies have evolved significantly in the past years enabling the sequencing of a large number of genomes in a short time. Nevertheless, the underlying computational problem is hard, and many technical factors and limitations complicate obtaining the complete sequence of a genome. Many genomes are left in a draft state, in which each chromosome is represented by a set of sequences with partial information on their relative order. Recently, some approaches have been proposed to compare draft genomes by comparing paths in de Bruijn graphs, which are constructed by many practical genome assemblers. In this article we introduce gcBB, a method for comparing genomes represented as succinct colored de Bruijn graphs directly, without resorting to sequence alignments, by means of the entropy and expectation measures based on the Burrows-Wheeler Similarity Distribution. We also introduce…
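The abstract names entropy and expectation measures based on the Burrows-Wheeler Similarity Distribution (BWSD) but does not spell out how they are computed. As a rough illustration only, the sketch below follows the usual string-based BWSD formulation: given a sequence of color labels recording which genome each symbol came from, it takes the lengths of maximal runs of equal labels, treats their relative frequencies as a probability distribution, and reports that distribution's expectation and Shannon entropy. The function names `run_lengths` and `bwsd_measures` and the encoding of colors as a string of `0`/`1` characters are assumptions made for this example; this is not the gcBB implementation, which operates on succinct colored de Bruijn graphs.

```python
from collections import Counter
from itertools import groupby
from math import log2


def run_lengths(colors):
    """Return the lengths of maximal runs of equal labels in `colors`."""
    return [sum(1 for _ in group) for _, group in groupby(colors)]


def bwsd_measures(colors):
    """Return (expectation, entropy) of the run-length distribution.

    `colors` is a hypothetical sequence of genome labels, for example
    the origin of each symbol in a transformed, merged sequence.
    """
    runs = run_lengths(colors)
    if not runs:
        raise ValueError("colors must not be empty")

    total = len(runs)
    # P(n) = (number of runs of length n) / (total number of runs)
    probs = {n: count / total for n, count in Counter(runs).items()}

    expectation = sum(n * p for n, p in probs.items())
    entropy = -sum(p * log2(p) for p in probs.values())
    return expectation, entropy


if __name__ == "__main__":
    # Perfectly interleaved labels: every run has length 1.
    print(bwsd_measures("01010101"))
    # Longer, uneven runs give a larger expectation and nonzero entropy.
    print(bwsd_measures("00010110"))
```

Under this formulation, perfectly alternating labels give an expectation of 1 and an entropy of 0, which is read as maximal similarity; longer and more varied runs push both values up.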
Taxonomy
Topics: Genomics and Phylogenetic Studies
