Rate Matrix Estimation From Site Frequency Data
Conrad J. Burden, Yurong Tang

TL;DR
This paper introduces a method to estimate evolutionary rate matrices from site frequency data, assuming a Wright-Fisher model, multiple genomes, and neutral sites, without requiring reversibility of the rate matrix.
Contribution
It provides a novel estimation procedure based on an approximate stationary solution to the Wright-Fisher model's Kolmogorov equation, accommodating non-reversible rate matrices.
Findings
Rate matrices can be estimated from observed site frequency data.
No reversibility assumption required for the rate matrix.
Applicable to alignments of a moderate number of sequenced genomes.
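As an illustration of what "site frequency data" from an alignment might look like, the following minimal sketch tabulates per-site allele counts from a toy multiple alignment. The function names, the four-letter alphabet, and the toy sequences are assumptions made for illustration; the paper's own data format and preprocessing are not described here.

```python
from collections import Counter

# Assumed allele alphabet for illustration (nucleotides).
ALLELES = "ACGT"


def site_allele_counts(alignment):
    """Return, for each alignment column, a tuple of counts ordered as ALLELES."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences in the alignment must have equal length")
    counts = []
    for column in zip(*alignment):  # iterate over sites (columns)
        tally = Counter(column)
        counts.append(tuple(tally.get(a, 0) for a in ALLELES))
    return counts


def site_frequency_table(alignment):
    """Count how many sites show each allele-count configuration."""
    return Counter(site_allele_counts(alignment))


# Toy alignment of three sampled genomes, four sites each (hypothetical data).
toy_alignment = ["ACGT", "ACGA", "ACTA"]
table = site_frequency_table(toy_alignment)
```

Here each key of `table` is a count vector over (A, C, G, T) summing to the number of sampled genomes, and each value is the number of sites with that configuration.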
Abstract
A procedure is described for estimating evolutionary rate matrices from observed site frequency data. The procedure assumes (1) that the data are obtained from a constant size population evolving according to a stationary Wright-Fisher model; (2) that the data consist of a multiple alignment of a moderate number of sequenced genomes drawn randomly from the population; and (3) that within the genome a large number of independent, neutral sites evolving with a common mutation rate matrix can be identified. No restrictions are imposed on the scaled rate matrix other than that the off-diagonal elements are positive and ≪ 1, and that the rows sum to zero. In particular the rate matrix is not assumed to be reversible. The key to the method is an approximate stationary solution to the forward Kolmogorov equation for the multi-allele neutral Wright-Fisher model in the limit of low mutation…
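The constraints stated for the rate matrix (positive off-diagonal elements, rows summing to zero) and the notion of reversibility can be made concrete with a short sketch. It uses the standard detailed-balance criterion: a rate matrix Q with stationary distribution π is reversible when π_i Q_ij = π_j Q_ji for all i, j. The function names, tolerances, and example matrices are assumptions for illustration and are not taken from the paper; this is not the paper's estimation procedure.

```python
def is_valid_rate_matrix(Q, tol=1e-12):
    """Check that Q is square, has positive off-diagonals, and rows summing to zero.

    The "off-diagonals much smaller than 1" condition is not checked here.
    """
    n = len(Q)
    for i, row in enumerate(Q):
        if len(row) != n or abs(sum(row)) > tol:
            return False
        if any(row[j] <= 0 for j in range(n) if j != i):
            return False
    return True


def stationary_distribution(Q, iterations=2000):
    """Approximate pi with pi Q = 0 by power iteration on P = I + h Q."""
    n = len(Q)
    # Step size chosen so every diagonal entry of P is at least 0.5.
    h = 0.5 / max(-Q[i][i] for i in range(n))
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [pi[j] + h * sum(pi[i] * Q[i][j] for i in range(n)) for j in range(n)]
        total = sum(pi)
        pi = [p / total for p in pi]
    return pi


def is_reversible(Q, tol=1e-9):
    """Test detailed balance: pi_i Q_ij == pi_j Q_ji for all pairs i, j."""
    n = len(Q)
    pi = stationary_distribution(Q)
    return all(
        abs(pi[i] * Q[i][j] - pi[j] * Q[j][i]) <= tol
        for i in range(n)
        for j in range(i + 1, n)
    )


# Hypothetical 3-allele examples: a cyclically biased (non-reversible) matrix
# and a reversible one built as Q_ij = 0.01 * pi_j with pi = (0.5, 0.3, 0.2).
Q_cyclic = [[-0.03, 0.02, 0.01], [0.01, -0.03, 0.02], [0.02, 0.01, -0.03]]
Q_rev = [[-0.005, 0.003, 0.002], [0.005, -0.007, 0.002], [0.005, 0.003, -0.008]]
```

Both example matrices satisfy the stated constraints, but only `Q_rev` satisfies detailed balance; a matrix like `Q_cyclic` is the kind of case a reversibility assumption would exclude.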
Taxonomy
Topics: Evolution and Genetic Dynamics · Genetic diversity and population structure · Genetic Mapping and Diversity in Plants and Animals
