Fast sequence to graph alignment using the graph wavefront algorithm
Haowen Zhang, Shiqi Wu, Srinivas Aluru, Heng Li

TL;DR
The paper introduces Gwfa, a sequence-to-graph alignment algorithm that runs up to 10,000 times faster than existing methods, with the largest gains on closely matching sequences, making it practical to align long sequences against large pan-genome graphs.
Contribution
The paper presents Gwfa, a novel sequence-to-graph alignment algorithm optimized for speed on similar sequences, with a graph pruning heuristic for further acceleration.
Findings
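On four real datasets, Gwfa is up to 10,000 times faster than existing algorithms.
Its worst-case time complexity matches that of existing algorithms, but in practice its runtime often grows only moderately with the edit distance of the optimal alignment, so closely matching sequences align fastest.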
Graph pruning yields an additional ~10-fold speedup.
Abstract
Motivation: A pan-genome graph represents a collection of genomes and encodes sequence variations between them. It is a powerful data structure for studying multiple similar genomes. Sequence-to-graph alignment is an essential step for the construction and the analysis of pan-genome graphs. However, existing algorithms incur runtime proportional to the product of sequence length and graph size, making them inefficient for aligning long sequences against large graphs.
Results: We propose the graph wavefront alignment algorithm (Gwfa), a new method for aligning a sequence to a sequence graph. Although the worst-case time complexity of Gwfa is the same as the existing algorithms, it is designed to run faster for closely matching sequences, and its runtime in practice often increases only moderately with the edit distance of the optimal alignment. On four real datasets, Gwfa is up to four…
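The abstract's central idea, exploring alignments in order of increasing edit distance instead of filling a table sized by sequence length times graph size, can be illustrated with a small sketch. This is not the authors' Gwfa implementation: it is a plain 0-1 breadth-first search over alignment states, and the function name `graph_edit_distance`, the `labels` and `edges` dictionaries, and the choice to anchor the alignment at the first base of a start node while letting it end anywhere in the graph are all assumptions made for illustration.

```python
from collections import deque


def graph_edit_distance(labels, edges, start, query):
    """Minimum edit distance between `query` and any walk that begins at the
    first base of node `start` (hypothetical helper, not the Gwfa API).

    labels: dict mapping node id -> node sequence (str)
    edges:  dict mapping node id -> list of successor node ids
    """
    m = len(query)
    first = (start, 0, 0)  # (node, offset within node label, query position)
    dist = {first: 0}
    queue = deque([first])
    while queue:
        state = queue.popleft()
        v, i, j = state
        d = dist[state]
        if j == m:
            # Whole query consumed; states are popped in order of cost,
            # so the first such state is optimal.
            return d
        moves = []  # list of (next_state, cost)
        if i == len(labels[v]):
            # End of this node's label: step to each successor at no cost.
            for w in edges.get(v, ()):
                moves.append(((w, 0, j), 0))
        else:
            # Match (cost 0) or mismatch (cost 1) against the next graph base.
            cost = 0 if labels[v][i] == query[j] else 1
            moves.append(((v, i + 1, j + 1), cost))
            # Graph base left unaligned.
            moves.append(((v, i + 1, j), 1))
        # Query base left unaligned.
        moves.append(((v, i, j + 1), 1))
        for nxt, cost in moves:
            if d + cost < dist.get(nxt, float("inf")):
                dist[nxt] = d + cost
                if cost == 0:
                    queue.appendleft(nxt)  # free moves are explored first
                else:
                    queue.append(nxt)
    return None


# A toy graph with one bubble: AC -> (G | T) -> CA
toy_labels = {0: "AC", 1: "G", 2: "T", 3: "CA"}
toy_edges = {0: [1, 2], 1: [3], 2: [3]}
print(graph_edit_distance(toy_labels, toy_edges, 0, "ACTCA"))
print(graph_edit_distance(toy_labels, toy_edges, 0, "ACCA"))
```

In the toy graph, ACTCA spells the walk through the T branch exactly, while ACCA is one deletion away from either branch. Because the search only visits states whose cost does not exceed the optimal edit distance, a closely matching query touches a small part of the state space, which is the behavior the abstract describes; the sketch omits the diagonal-based wavefront bookkeeping and the pruning heuristic.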
Taxonomy
Topics: Genomics and Phylogenetic Studies · Chromosomal and Genetic Variations · Plant Virus Research Studies
Methods: Pruning
