Ultrafast topological data analysis reveals pandemic-scale dynamics of convergent evolution
Michael Bleher, Lukas Hahn, Maximilian Neumann, Zachary Ardern, Juan Angel Patino-Galindo, Mathieu Carriere, Ulrich Bauer, Raul Rabadan, Andreas Ott

TL;DR
This paper introduces EVOtRec, a fast, scalable topological data analysis method that detects convergent evolution in large genomic datasets without phylogenetic trees, enabling real-time tracking of adaptive variants.
Contribution
EVOtRec is a novel organism-agnostic approach that infers convergent genomic variants directly from topological patterns, outperforming traditional phylogeny-based methods in speed while maintaining accuracy.
Findings
EVOtRec accurately identifies variants under positive selection.
It is orders of magnitude faster than existing phylogeny-based methods.
It was successfully applied to large datasets of SARS-CoV-2, H5N1, and HIV-1 genomes.
Abstract
Genome variants which re-occur independently across evolutionary lineages are key molecular signatures of adaptation. Inferring the dynamics of such genetic changes from pandemic-scale genomic datasets is now possible, which opens up unprecedented insight into evolutionary processes. However, existing approaches depend on the construction of accurate phylogenetic trees, which remains challenging at scale. Here we present EVOtRec, an organism-agnostic, fast and scalable Topological Data Analysis approach that enables the inference of convergently evolving genomic variants over time directly from topological patterns in the dataset, without requiring the construction of a phylogenetic tree. Using data from both simulations and published experiments, we show that EVOtRec can robustly identify variants under positive selection and performs orders of magnitude faster than state-of-the-art…
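To give a rough sense of what a "topological pattern" can look like in sequence data, the sketch below counts independent loops (the first Betti number) in a Vietoris-Rips complex built on Hamming distances. This is not the EVOtRec algorithm. It is a toy illustration that assumes aligned, equal-length sequences, and the names `hamming` and `betti_1` are hypothetical. The idea it shows is that a strictly tree-like set of genomes has no loops at the mutation scale, whereas the same variant arising independently on two genetic backgrounds can close one.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def betti_1(seqs, scale):
    """First Betti number (over GF(2)) of the Vietoris-Rips complex
    on `seqs` at the given Hamming-distance scale.

    Illustrative only: brute force over all triples, so suitable
    for tiny inputs, not for real genomic datasets.
    """
    n = len(seqs)
    # Edges: pairs of sequences within `scale` mutations of each other.
    edges = [(i, j) for i, j in combinations(range(n), 2)
             if hamming(seqs[i], seqs[j]) <= scale]
    edge_index = {e: k for k, e in enumerate(edges)}

    # Connected components via union-find.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    components = len({find(i) for i in range(n)})

    # Rank of the triangle boundary map over GF(2).
    # Each triangle boundary is a bitmask over the edge list.
    basis = {}  # highest set bit -> reduced vector
    rank = 0
    for i, j, k in combinations(range(n), 3):
        tri = [(i, j), (i, k), (j, k)]
        if all(e in edge_index for e in tri):
            v = 0
            for e in tri:
                v |= 1 << edge_index[e]
            while v:
                pivot = v.bit_length() - 1
                if pivot not in basis:
                    basis[pivot] = v
                    rank += 1
                    break
                v ^= basis[pivot]

    # b1 = dim(cycle space) - rank(triangle boundaries)
    cycle_space_dim = len(edges) - (n - components)
    return cycle_space_dim - rank


# Tree-like history: 00 -> 01 -> 11, each mutation occurs once.
tree_like = ["00", "01", "11"]
# Same data plus "10": the second-site mutation now appears on two
# different backgrounds, which closes a loop of single-mutation steps.
with_recurrence = ["00", "01", "11", "10"]

print(betti_1(tree_like, scale=1))
print(betti_1(with_recurrence, scale=1))
```

In this toy setting the loop disappears again at scale 2, where all four sequences become mutually connected, which is why methods of this kind track such features across a range of scales rather than at a single threshold.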
Taxonomy
Topics: Vaccines and immunoinformatics approaches · Immune responses and vaccinations · Bioinformatics and Genomic Networks
