# Generalising rate heterogeneity across sites in statistical phylogenetics

**Authors:** Sarah E. Heaps, Tom M. W. Nye, Richard J. Boys, Tom A. Williams, Svetlana Cherlin, T. Martin Embley

arXiv: 1702.05972 · 2019-05-03

## TL;DR

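This paper introduces quadratic models for rate heterogeneity in phylogenetics. They let the selective coefficients of different types of point mutation vary across sites, in addition to overall selective constraints, and for certain non-stationary processes they let sequence composition vary across both sites and taxa.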

## Contribution

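The paper extends existing linear rate-heterogeneity models, in which a site-specific multiplier scales a baseline rate matrix, to models based on a quadratic transformation of that matrix. The extension allows variation in the selective coefficients of different types of point mutation at a site and, for certain non-stationary processes, variation in sequence composition across sites and taxa.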

## Key findings

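- Quadratic models are applied to alignments spanning the tree of life and compared with site-homogeneous and linear models.
- For certain non-stationary processes, the models allow sequence composition to vary across both sites and taxa.
- A Bayesian approach is adopted, with an MCMC algorithm for posterior inference and accompanying software.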

## Abstract

Phylogenetics uses alignments of molecular sequence data to learn about evolutionary trees relating species. Along branches, sequence evolution is modelled using a continuous-time Markov process characterised by an instantaneous rate matrix. Early models assumed the same rate matrix governed substitutions at all sites of the alignment, ignoring variation in evolutionary pressures. Substantial improvements in phylogenetic inference and model fit were achieved by augmenting these models with multiplicative random effects that describe the result of variation in selective constraints and allow sites to evolve at different rates which linearly scale a baseline rate matrix. Motivated by this pioneering work, we consider an extension using a quadratic, rather than linear, transformation. The resulting models allow for variation in the selective coefficients of different types of point mutation at a site in addition to variation in selective constraints. We derive properties of the extended models. For certain non-stationary processes, the extension gives a model that allows variation in sequence composition both across sites and taxa. We adopt a Bayesian approach, describe an MCMC algorithm for posterior inference and provide software. Our quadratic models are applied to alignments spanning the tree of life and compared with site-homogeneous and linear models.
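
As a rough illustration of the linear scaling described in the abstract, and of why a quadratic transformation can change the relative rates of different substitutions, the Python sketch below uses a made-up 4-state baseline matrix `BASELINE_Q`. The quadratic form `a*Q + b*Q^2` and the names `linear_site_matrix`, `quadratic_site_matrix` and `is_rate_matrix` are assumptions chosen for illustration only; the paper's actual parametrisation and constraints may differ.

```python
# Illustrative sketch only. BASELINE_Q is an invented 4-state rate matrix and
# a*Q + b*Q^2 is an ASSUMED quadratic form, not necessarily the paper's model.

BASELINE_Q = [
    [-0.9, 0.2, 0.5, 0.2],
    [0.1, -0.6, 0.1, 0.4],
    [0.4, 0.2, -0.8, 0.2],
    [0.1, 0.5, 0.1, -0.7],
]


def matmul(A, B):
    """Plain-Python product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_rate_matrix(Q, tol=1e-9):
    """True if rows sum to zero and all off-diagonal entries are non-negative."""
    n = len(Q)
    rows_ok = all(abs(sum(row)) < tol for row in Q)
    offdiag_ok = all(Q[i][j] >= -tol
                     for i in range(n) for j in range(n) if i != j)
    return rows_ok and offdiag_ok


def linear_site_matrix(Q, r):
    """Linear rate heterogeneity: the site's matrix is r * Q with r > 0."""
    return [[r * q for q in row] for row in Q]


def quadratic_site_matrix(Q, a, b):
    """Hypothetical quadratic transformation: a * Q + b * Q^2."""
    Q2 = matmul(Q, Q)
    return [[a * q + b * q2 for q, q2 in zip(row, row2)]
            for row, row2 in zip(Q, Q2)]


if __name__ == "__main__":
    lin = linear_site_matrix(BASELINE_Q, 2.0)
    quad = quadratic_site_matrix(BASELINE_Q, 1.0, 0.1)
    # Linear scaling preserves the ratio between any two off-diagonal rates.
    print("baseline ratio :", BASELINE_Q[0][2] / BASELINE_Q[0][1])
    print("linear ratio   :", lin[0][2] / lin[0][1])
    # The quadratic form changes that ratio, so relative rates vary by site.
    print("quadratic ratio:", quad[0][2] / quad[0][1])
    print("quadratic valid:", is_rate_matrix(quad))
```

Because the rows of a rate matrix sum to zero, the rows of its square also sum to zero, so any combination `a*Q + b*Q^2` keeps zero row sums. In this sketch the only remaining requirement is that the off-diagonal entries stay non-negative, which restricts the admissible pairs `(a, b)`; for example, `a = 1, b = 1` fails that check for this particular baseline.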

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1702.05972/full.md

## Figures

18 figures with captions in the complete paper: https://tomesphere.com/paper/1702.05972/full.md

## References

42 references — full list in the complete paper: https://tomesphere.com/paper/1702.05972/full.md

---
Source: https://tomesphere.com/paper/1702.05972