Exploring the consequences of lack of closure in codon models
Michael D. Woodhams, Jeremy G. Sumner, David A. Liberles, Michael A. Charleston, Barbara R. Holland

TL;DR
This study investigates how non-closure in codon models affects the accuracy of estimating positive selection and branch lengths, revealing potential biases and the limited benefits of using closed DNA models.
Contribution
It demonstrates that non-closed codon models can lead to biased estimates and shows that using closed DNA models does not necessarily improve accuracy in codon model estimations.
Findings
Errors in estimating $\omega$ can reach 17%.
Both $\omega$ and branch lengths are mis-estimated under heterogeneity.
Closed DNA models do not significantly reduce estimation errors.
Abstract
Models of codon evolution are commonly used to identify positive selection. Positive selection is typically a heterogeneous process, i.e., it acts on some branches of the evolutionary tree and not others. Previous work on DNA models showed that when evolution occurs under a heterogeneous process it is important to consider the property of model closure, because non-closed models can give biased estimates of evolutionary processes. The existing codon models that account for the genetic code are not closed; to establish this it is enough to show that they are not linear (meaning that the sum of two codon rate matrices in the model is not a matrix in the model). This raises the concern that a single codon model fit to a heterogeneous process might mis-estimate both the effect of selection and branch lengths. Codon models are typically constructed by choosing an underlying DNA model…
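The linearity condition mentioned in the abstract can be checked numerically on a toy example. The sketch below is not the paper's simulation setup. It assumes a simplified codon model with equal codon frequencies, a single transition/transversion ratio `kappa` for the DNA component, and a multiplier `omega` on nonsynonymous changes. The names `codon_rates` and `fit_in_model` are hypothetical helpers introduced only for this illustration. Only off-diagonal rates are compared, since the diagonal of a rate matrix is fixed by the rows summing to zero.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
SENSE = [c for c in CODE if CODE[c] != "*"]  # the 61 sense codons
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def codon_rates(kappa, omega):
    """Off-diagonal rates of the simplified codon model (assumed form).

    Codons differing at exactly one position get rate 1, multiplied by
    kappa for a transition and by omega for a nonsynonymous change.
    """
    rates = {}
    for a in SENSE:
        for b in SENSE:
            diff = [(x, y) for x, y in zip(a, b) if x != y]
            if len(diff) != 1:
                continue
            rate = 1.0
            if frozenset(diff[0]) in TRANSITIONS:
                rate *= kappa
            if CODE[a] != CODE[b]:
                rate *= omega
            rates[(a, b)] = rate
    return rates


def fit_in_model(rates, tol=1e-9):
    """Return (scale, kappa, omega) if rates equal scale * codon_rates(kappa, omega)."""
    scale = rates[("CTT", "CTA")]          # synonymous transversion (Leu to Leu)
    kappa = rates[("CTT", "CTC")] / scale  # synonymous transition (Leu to Leu)
    omega = rates[("CTT", "ATT")] / scale  # nonsynonymous transversion (Leu to Ile)
    model = codon_rates(kappa, omega)
    ok = all(abs(rates[k] - scale * model[k]) <= tol for k in rates)
    return (scale, kappa, omega) if ok else None


# Two members of the model with different kappa and different omega.
q1 = codon_rates(2.0, 0.2)
q2 = codon_rates(5.0, 3.0)
total = {k: q1[k] + q2[k] for k in q1}

print(fit_in_model(q1))     # a member of the model is recovered
print(fit_in_model(total))  # None: the sum is not a member of the model
```

In this toy parametrization the sum fails because nonsynonymous transitions carry the product `kappa * omega`, and a sum of products is in general not the product of the averaged parameters. If the two matrices share the same `kappa` or the same `omega`, the sum does fall back inside this simplified model.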
Taxonomy
Topics: Genomics and Phylogenetic Studies · RNA and protein synthesis mechanisms · Genetic diversity and population structure
