Incorporating compositional heterogeneity into Lie Markov models for phylogenetic inference
Naomi E. Hannaford, Sarah E. Heaps, Tom M. W. Nye, Tom A. Williams, and T. Martin Embley

TL;DR
This paper introduces a non-stationary, non-reversible Lie Markov model for phylogenetics that can distinguish rooted trees, addressing limitations of traditional models that assume stationarity and reversibility.
Contribution
It develops a novel non-stationary, non-reversible Lie Markov model for phylogenetic inference, enabling root identification and better biological interpretation.
Findings
The model's likelihood distinguishes between rooted trees, which stationary, time-reversible models cannot do.
Bayesian inference via MCMC is implemented for the new model.
Application shows improved root identification in biological data.
Abstract
Phylogenetics uses alignments of molecular sequence data to learn about evolutionary trees. Substitutions in sequences are modelled through a continuous-time Markov process, characterised by an instantaneous rate matrix, which standard models assume is time-reversible and stationary. These assumptions are biologically questionable and induce a likelihood function which is invariant to a tree's root position. This hampers inference because a tree's biological interpretation depends critically on where it is rooted. Relaxing both assumptions, we introduce a model whose likelihood can distinguish between rooted trees. The model is non-stationary, with step changes in the instantaneous rate matrix at each speciation event. Exploiting recent theoretical work, each rate matrix belongs to a non-reversible family of Lie Markov models. These models are closed under matrix multiplication, so our…
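The root-invariance point in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes a two-taxon tree of total length `T` with the root placed a distance `s` from the first leaf, a uniform root distribution, and two hand-picked 4-state rate matrices (a Jukes-Cantor matrix, which is reversible, and a hypothetical cyclic matrix, which is stationary under the uniform distribution but not reversible). The helper names `transition_matrix`, `circulant_rates` and `site_likelihood` are illustrative only.

```python
def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_matrix(q, t, squarings=10, terms=20):
    """Approximate P(t) = exp(Q t) by a Taylor series with scaling and squaring."""
    n = len(q)
    scale = t / (2 ** squarings)
    m = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for order in range(1, terms):
        term = [[v / order for v in row] for row in mat_mul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def circulant_rates(a, b, c):
    """4-state rate matrix with rate a for i -> i+1, b for i -> i+2, c for i -> i+3 (mod 4)."""
    offsets = [0.0, a, b, c]
    q = [[offsets[(j - i) % 4] for j in range(4)] for i in range(4)]
    for i in range(4):
        q[i][i] = -(a + b + c)  # rows of a rate matrix sum to zero
    return q


def site_likelihood(q, root_dist, s, total, x, y):
    """Probability of observing states (x, y) at the two leaves.

    The root sits at distance s from the leaf showing x and total - s
    from the leaf showing y.
    """
    p_left = transition_matrix(q, s)
    p_right = transition_matrix(q, total - s)
    return sum(root_dist[r] * p_left[r][x] * p_right[r][y]
               for r in range(len(q)))


UNIFORM = [0.25, 0.25, 0.25, 0.25]
JC = circulant_rates(1 / 3, 1 / 3, 1 / 3)   # reversible (symmetric) example
CYCLIC = circulant_rates(1.0, 0.1, 0.1)     # non-reversible example (assumed values)

T = 1.0
root_positions = (0.0, 0.5, 1.0)
jc = [site_likelihood(JC, UNIFORM, s, T, 0, 1) for s in root_positions]
cyc = [site_likelihood(CYCLIC, UNIFORM, s, T, 0, 1) for s in root_positions]

print("reversible:    ", jc)
print("non-reversible:", cyc)
```

Under the reversible matrix the three likelihoods agree up to floating-point error, so the data carry no information about where the root lies. Under the non-reversible matrix they differ, which is the property that lets a likelihood distinguish between rooted trees.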
