A simple, general result for the variance of substitution number in molecular evolution
Bahram Houchmandzadeh, Marcel Vallade

TL;DR
This paper derives a simple formula for the variance of the number of substitutions in molecular evolution. The result simplifies the analysis of substitution models and of the dispersion index R (the ratio of the variance to the mean substitution number), a key quantity in the neutral theory of molecular evolution.
Contribution
It provides a straightforward method to compute the variance of the substitution number, valid for both short and long evolutionary times and simpler than earlier approaches.
Findings
Variance computation is as simple as mean calculation.
The dispersion index R is generally ≥1, but significantly larger values require specific model assumptions.
The result aids in analyzing the neutral theory of molecular evolution.
Abstract
The number of substitutions (of nucleotides, amino acids, ...) that take place during the evolution of a sequence is a stochastic variable of fundamental importance in the field of molecular evolution. Although the mean number of substitutions during molecular evolution of a sequence can be estimated for a given substitution model, no simple solution exists for the variance of this random variable. We show in this article that the computation of the variance is as simple as that of the mean number of substitutions for both short and long times. Apart from its fundamental importance, this result can be used to investigate the dispersion index R, i.e. the ratio of the variance to the mean substitution number, which is of prime importance in the neutral theory of molecular evolution. By investigating large classes of substitution models, we demonstrate that although R ≥ 1, to obtain R…
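To make the definition of R concrete, the following is a minimal Monte Carlo sketch, not the paper's analytical method. It simulates a continuous-time Markov substitution process and estimates R as the sample variance of the substitution count divided by its sample mean. The function names (`count_substitutions`, `dispersion_index`), the rate matrix, and the run parameters are illustrative assumptions. With equal exit rates from every state (a Jukes-Cantor-like model), the substitution count is Poisson distributed, so the estimate should come out close to 1.

```python
import random
from statistics import mean, variance


def count_substitutions(rates, t, rng, start=0):
    """Count substitutions in time t for one simulated lineage.

    rates[i][j] is the (hypothetical) substitution rate from state i to
    state j; diagonal entries are ignored.
    """
    state, clock, n = start, 0.0, 0
    while True:
        # Possible moves out of the current state and their rates.
        out = [(j, r) for j, r in enumerate(rates[state]) if j != state and r > 0]
        total = sum(r for _, r in out)
        if total == 0:
            return n  # absorbing state: no further substitutions
        # Waiting time to the next substitution is exponential.
        clock += rng.expovariate(total)
        if clock > t:
            return n
        # Pick the target state with probability proportional to its rate.
        u = rng.random() * total
        acc = 0.0
        for j, r in out:
            acc += r
            if u < acc:
                break
        state = j
        n += 1


def dispersion_index(counts):
    """Sample variance of the counts divided by their sample mean."""
    return variance(counts) / mean(counts)


if __name__ == "__main__":
    rng = random.Random(1)
    # Assumed example: 4 states, every off-diagonal rate equal to 1/3.
    jc = [[0.0 if i == j else 1.0 / 3.0 for j in range(4)] for i in range(4)]
    counts = [count_substitutions(jc, 5.0, rng) for _ in range(2000)]
    print("mean:", mean(counts), "R:", dispersion_index(counts))
```

Replacing the rate matrix with one whose exit rates differ between states lets the same two functions be used to explore how R departs from 1 in a given model. The simulated value is only an estimate and carries sampling noise.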
