Ultrafast learning of 4-node hybridization cycles in phylogenetic networks using algebraic invariants
Zhaoxing Wu, Claudia Solis-Lemus

TL;DR
This paper introduces a fast, algebraic invariant-based method for reconstructing 4-node hybridization cycles in phylogenetic networks, significantly improving speed over existing approaches while maintaining accuracy.
Contribution
It is the first to define phylogenetic invariants on concordance factors for level-1 networks, enabling optimization-free inference under the multispecies coalescent model.
Findings
The method is at least 10 times faster than existing network methods.
Accurately reconstructs phylogenetic networks in simulated scenarios.
Successfully applied to the Canis genus data.
Abstract
Motivation: The abundance of gene flow in the Tree of Life challenges the notion that evolution can be represented with a fully bifurcating process, as this process cannot capture important biological realities like hybridization, introgression, or horizontal gene transfer. Coalescent-based network methods are increasingly popular, yet not scalable for big data, because they need to perform a heuristic search in the space of networks as well as numerical optimization that can be NP-hard.
Results: Here, we introduce a novel method to reconstruct phylogenetic networks based on algebraic invariants. While there is a long tradition of using algebraic invariants in phylogenetics, our work is the first to define phylogenetic invariants on concordance factors (frequencies of 4-taxon splits in the input gene trees) to identify level-1 phylogenetic networks under the multispecies coalescent…
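The abstract defines concordance factors as frequencies of 4-taxon splits in the input gene trees. The following is a minimal Python sketch of that definition only; it does not implement the paper's invariants or its inference method. The input format (one split string such as "A,B|C,D" per gene tree) and the function names `canonical_split` and `concordance_factors` are assumptions made for illustration.

```python
from collections import Counter


def canonical_split(split):
    """Normalize a split string such as 'B,A|D,C' so equal splits compare equal.

    The 'X,Y|Z,W' string format is a hypothetical input convention.
    """
    left, right = split.split("|")
    sides = sorted(tuple(sorted(side.split(","))) for side in (left, right))
    return sides[0], sides[1]


def concordance_factors(taxa, gene_tree_splits):
    """Return the frequency of each of the three possible splits of four taxa.

    gene_tree_splits holds one split string per gene tree, i.e. how that
    gene tree resolves the four taxa.
    """
    a, b, c, d = sorted(taxa)
    # The three possible unrooted resolutions of four taxa.
    possible = [
        ((a, b), (c, d)),
        ((a, c), (b, d)),
        ((a, d), (b, c)),
    ]
    counts = Counter(canonical_split(s) for s in gene_tree_splits)
    total = sum(counts[p] for p in possible)
    if total == 0:
        raise ValueError("no gene tree resolves the requested four taxa")
    return {p: counts[p] / total for p in possible}


# Toy input: ten gene trees, made up for illustration (not data from the paper).
splits = ["A,B|C,D"] * 6 + ["A,C|B,D"] * 2 + ["D,A|C,B"] * 2
cf = concordance_factors(["A", "B", "C", "D"], splits)
for split, freq in cf.items():
    print(split, freq)
```

Each 4-taxon subset yields three such frequencies that sum to one. According to the abstract, the invariants are defined on these quantities.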
Taxonomy
Topics: Genomics and Phylogenetic Studies · Evolution and Paleontology Studies · Genetic diversity and population structure
