TL;DR
This paper introduces a path index for variation graphs that is built from de Bruijn graphs and encoded with the Burrows-Wheeler transform, giving fast, space-efficient lookup of the graph paths that match a query string.
Contribution
It proposes using de Bruijn graphs as path indexes for variation graphs, compressing them by merging redundant subgraphs and encoding them with the Burrows-Wheeler transform.
Findings
The proposed index is fast, space-efficient, and versatile.
It is compressed by merging redundant subgraphs of the de Bruijn graph.
It is used in the variation graph toolkit vg.
Abstract
Variation graphs, which represent genetic variation within a population, are replacing sequences as reference genomes. Path indexes are one of the most important tools for working with variation graphs. They generalize text indexes to graphs, allowing one to find the paths matching the query string. We propose using de Bruijn graphs as path indexes, compressing them by merging redundant subgraphs, and encoding them with the Burrows-Wheeler transform. The resulting fast, space-efficient, and versatile index is used in the variation graph toolkit vg.
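As a rough illustration of what a path index does, the sketch below lists every k-mer spelled by some path in a toy variation graph and records where each one starts. The toy graph, the `kmer_index` function, and the dictionary layout are assumptions made for this example only. The index described in the abstract also merges redundant subgraphs and is encoded with the Burrows-Wheeler transform, and neither step is shown here.

```python
from collections import defaultdict


def kmer_index(labels, edges, k):
    """Map each k-mer spelled by a path to the (node, offset) positions where it starts.

    labels: dict of node id -> sequence stored on that node
    edges:  dict of node id -> list of successor node ids
    """
    index = defaultdict(set)

    def extend(node, offset, prefix):
        # Take as many characters as are still needed from the current node.
        prefix += labels[node][offset:offset + k - len(prefix)]
        if len(prefix) == k:
            yield prefix
            return
        # Otherwise continue along every outgoing edge.
        for successor in edges.get(node, ()):
            yield from extend(successor, 0, prefix)

    for node, label in labels.items():
        for offset in range(len(label)):
            for kmer in extend(node, offset, ""):
                index[kmer].add((node, offset))
    return dict(index)


# Toy graph with one variant site: it spells GATTACA and GATCACA.
labels = {1: "GAT", 2: "T", 3: "C", 4: "ACA"}
edges = {1: [2, 3], 2: [4], 3: [4]}

index = kmer_index(labels, edges, k=3)
print(sorted(index))
print(index["ATC"])  # start positions of the k-mer that crosses into node 3
```

Looking up a k-mer in `index` returns the graph positions where a matching path begins, which is the query a path index answers. A plain dictionary like this grows quickly with graph size and variant density, which is the cost that compression is meant to address.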
