Dimensional reduction for the general Markov model on phylogenetic trees
Jeremy G Sumner

TL;DR
This paper introduces a dimensional reduction technique for the general Markov model on phylogenetic trees that lowers the model's dimensionality while preserving the ability to statistically identify evolutionary divergence events.
Contribution
The author shows that taking certain linear combinations of site pattern counts reduces the model's dimensionality from exponential to quadratic in the number of taxa, enabling more efficient phylogenetic analysis.
Findings
Dimensionality reduced from exponential to quadratic in the number of taxa
Invariant subspace depends bilinearly on model parameters
Method retains ability to identify divergence events
Abstract
We present a method of dimensional reduction for the general Markov model of sequence evolution on a phylogenetic tree. We show that taking certain linear combinations of the associated random variables (site pattern counts) reduces the dimensionality of the model from exponential in the number of extant taxa, to quadratic in the number of taxa, while retaining the ability to statistically identify phylogenetic divergence events. A key feature is the identification of an invariant subspace which depends only bilinearly on the model parameters, in contrast to the usual multi-linear dependence in the full space. We discuss potential applications including the computation of split (edge) weights on phylogenetic trees from observed sequence data.
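To make the scale of the reduction concrete, the following is a minimal sketch, not the paper's construction. It counts site patterns in a toy alignment and then forms pairwise marginal counts, which are one familiar family of linear combinations of site pattern counts whose number grows quadratically with the number of taxa. The specific linear combinations used in the paper are not reproduced here; the function names, the toy alignment, and the choice of pairwise marginals are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def site_pattern_counts(alignment):
    """Count each column (site pattern) of an alignment of equal-length strings."""
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    # zip(*alignment) yields one tuple of states per site (column).
    return Counter(zip(*alignment))


def pairwise_marginal_counts(pattern_counts, n_taxa):
    """Sum the pattern counts over every taxon except each pair (i, j).

    Each entry is a linear combination, with 0/1 coefficients, of the
    full site pattern counts. Illustrative only: this is not claimed
    to be the set of linear combinations used in the paper.
    """
    marginals = {}
    for i, j in combinations(range(n_taxa), 2):
        table = Counter()
        for pattern, count in pattern_counts.items():
            table[(pattern[i], pattern[j])] += count
        marginals[(i, j)] = table
    return marginals


def full_dimension(n_taxa, n_states=4):
    """Number of possible site patterns: exponential in the number of taxa."""
    return n_states ** n_taxa


def pairwise_dimension(n_taxa, n_states=4):
    """Number of pairwise marginal entries: quadratic in the number of taxa."""
    return (n_taxa * (n_taxa - 1) // 2) * n_states ** 2


# Hypothetical toy alignment: four taxa, six sites.
alignment = ["ACGTAC", "ACGTTC", "ACCTAC", "GCGTAC"]
counts = site_pattern_counts(alignment)
marginals = pairwise_marginal_counts(counts, len(alignment))

for n in (4, 10, 20):
    print(n, full_dimension(n), pairwise_dimension(n))
```

With four nucleotide states, the full pattern space for 10 taxa has 4^10 entries, while the pairwise tables in this sketch have 45 × 16 entries in total. The paper's reduced space is likewise quadratic in the number of taxa, but its exact dimension and basis should be taken from the paper itself.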
Taxonomy
Topics: Genomics and Phylogenetic Studies · Evolution and Paleontology Studies · Genetic diversity and population structure
