Linear Time Construction of Indexable Elastic Founder Graphs
Nicola Rizzo, Veli Mäkinen

TL;DR
This paper presents a method to construct indexable elastic founder graphs from multiple sequence alignments in linear time, enabling efficient pattern matching in genomics applications.
Contribution
It improves the preprocessing and construction algorithms for indexable elastic founder graphs to linear time complexity, enhancing scalability and efficiency.
Findings
Preprocessing time reduced to O(mn) for an MSA with m rows and n columns
EFG construction achieved in O(n) time after preprocessing
Supports fast pattern matching in genomic data
Abstract
Pattern matching on graphs has been widely studied lately due to its importance in genomics applications. Unfortunately, even the simplest problem of deciding if a string appears as a subpath of a graph admits a quadratic lower bound under the Orthogonal Vectors Hypothesis (Equi et al. ICALP 2019, SOFSEM 2021). To avoid this bottleneck, the research has shifted towards more specific graph classes, e.g. those induced from multiple sequence alignments (MSAs). Consider segmenting $\mathsf{MSA}[1..m,1..n]$ into $b$ blocks $\mathsf{MSA}[1..m,1..j_1]$, $\mathsf{MSA}[1..m,j_1+1..j_2]$, $\ldots$, $\mathsf{MSA}[1..m,j_{b-1}+1..n]$. The distinct strings in the rows of the blocks, after the removal of gap symbols, form the nodes of an elastic founder graph (EFG) where the edges represent the original connections observed in the MSA. An EFG is called indexable if a node label occurs as a prefix of…
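The block-and-node construction described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's linear-time algorithm: it only builds the graph for a segmentation that is already given, and it does not check indexability. The function name `build_efg`, the `ends` parameter (1-based last column of each block), the gap symbol `-`, and the choice to reject blocks in which a row consists only of gaps are all assumptions made for illustration.

```python
def build_efg(msa, ends, gap="-"):
    """Build an elastic founder graph from an MSA and a given segmentation.

    msa  : list of equal-length strings (the m rows of the alignment)
    ends : 1-based, strictly increasing last columns of the blocks;
           the final entry must equal n, the number of columns
    Returns (blocks, edges): blocks[k] is the sorted list of distinct node
    labels of block k; edges is a set of ((k, u), (k + 1, v)) pairs.
    """
    n = len(msa[0])
    if any(len(row) != n for row in msa):
        raise ValueError("all MSA rows must have the same length")
    if not ends or ends[0] < 1 or ends[-1] != n:
        raise ValueError("segmentation must cover columns 1..n")
    if any(a >= b for a, b in zip(ends, ends[1:])):
        raise ValueError("block ends must be strictly increasing")

    blocks, edges = [], set()
    prev_labels = None
    start = 0  # 0-based first column of the current block
    for k, end in enumerate(ends):
        # Spell each row inside the block and drop the gap symbols.
        labels = [row[start:end].replace(gap, "") for row in msa]
        # Simplifying assumption of this sketch: no row may be all gaps.
        if any(label == "" for label in labels):
            raise ValueError(f"block {k} contains a row with only gaps")
        blocks.append(sorted(set(labels)))
        # An edge joins the labels that the same row spells in
        # consecutive blocks, i.e. a connection observed in the MSA.
        if prev_labels is not None:
            for u, v in zip(prev_labels, labels):
                edges.add(((k - 1, u), (k, v)))
        prev_labels = labels
        start = end
    return blocks, edges


blocks, edges = build_efg(["AC-GT", "ACGGT", "A--GA"], [2, 5])
print(blocks)  # [['A', 'AC'], ['GA', 'GGT', 'GT']]
```

This direct construction inspects every cell of the MSA for the single segmentation it is handed. Finding a segmentation that yields an indexable EFG is the harder problem the paper addresses.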
Taxonomy
Topics: Algorithms and Data Compression · Genomics and Phylogenetic Studies · RNA and protein synthesis mechanisms
