Dimensional reduction for the general Markov model on phylogenetic trees

Jeremy G Sumner

arXiv:1602.07780·q-bio.PE·November 29, 2016·1 cites

Open Access

TL;DR

This paper introduces a dimensional reduction technique for the general Markov model on phylogenetic trees, simplifying the model's complexity while preserving the ability to infer evolutionary divergence events.

Contribution

The author shows that taking certain linear combinations of the site pattern counts reduces the model's dimensionality from exponential to quadratic in the number of taxa, which may enable more efficient phylogenetic analysis.

Findings

01 Dimensionality reduced from exponential to quadratic in taxa

02 Invariant subspace depends bilinearly on model parameters

03 Method retains ability to identify divergence events

Abstract

We present a method of dimensional reduction for the general Markov model of sequence evolution on a phylogenetic tree. We show that taking certain linear combinations of the associated random variables (site pattern counts) reduces the dimensionality of the model from exponential in the number of extant taxa, to quadratic in the number of taxa, while retaining the ability to statistically identify phylogenetic divergence events. A key feature is the identification of an invariant subspace which depends only bilinearly on the model parameters, in contrast to the usual multi-linear dependence in the full space. We discuss potential applications including the computation of split (edge) weights on phylogenetic trees from observed sequence data.
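
To make the scale of the reduction concrete, the following minimal sketch tabulates site pattern counts from a toy alignment and computes the size of the full pattern space. It is an illustration only, not the paper's method: the function names `site_pattern_counts` and `full_dimension` are hypothetical, a four-letter nucleotide alphabet is assumed, and the linear combinations that define the reduced space are not reproduced here because the abstract does not state them.

```python
from collections import Counter


def site_pattern_counts(alignment):
    """Count how often each column (site pattern) occurs in an alignment.

    `alignment` is a list of equal-length strings, one per taxon.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    # zip(*alignment) walks the alignment column by column.
    return Counter("".join(column) for column in zip(*alignment))


def full_dimension(n_taxa, n_states=4):
    """Number of possible site patterns: exponential in the number of taxa.

    Assumes n_states character states (4 for nucleotides).
    """
    return n_states ** n_taxa


if __name__ == "__main__":
    toy = ["AACG", "AACT", "AATG"]  # three taxa, four sites
    print(site_pattern_counts(toy))
    for n in (4, 8, 16):
        # n**2 only indicates the quadratic growth order stated in the
        # abstract; it is not the exact reduced dimension.
        print(n, full_dimension(n), n ** 2)
```

With 16 taxa the full space already has 4^16 possible patterns, while a quantity growing quadratically stays in the hundreds, which is the gap the reduction exploits.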

Taxonomy

Topics: Genomics and Phylogenetic Studies · Evolution and Paleontology Studies · Genetic diversity and population structure