Selective Constraints on Amino Acids Estimated by a Mechanistic Codon Substitution Model with Multiple Nucleotide Changes
Sanzo Miyazawa

TL;DR
This paper introduces a codon-based substitution model that estimates gene-specific selective constraints and mutational tendencies, improving the fit to empirical matrices and aiding phylogenetic analysis.
Contribution
It develops a mechanistic codon substitution model with multiple nucleotide changes that better fits empirical data and captures protein-specific constraints.
Findings
Allowing multiple nucleotide changes improves the fit to empirical substitution matrices, as judged by AIC.
Selective constraints are more protein-specific than species-specific.
The model provides biologically meaningful insights at both the nucleotide and amino acid levels.
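The AIC comparison behind the first finding can be illustrated with a minimal sketch. It uses the standard definition AIC = 2k - 2 ln L, where k is the number of free parameters and ln L is the maximized log-likelihood. The function names and the numbers in the usage example are hypothetical and are not taken from the paper.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def delta_aic(log_lik_a: float, k_a: int, log_lik_b: float, k_b: int) -> float:
    """AIC of model A minus AIC of model B; negative values favor model A."""
    return aic(log_lik_a, k_a) - aic(log_lik_b, k_b)


if __name__ == "__main__":
    # Hypothetical values for illustration only (not results from the paper):
    # model A allows multiple nucleotide changes and has more parameters,
    # model B is restricted to single nucleotide changes.
    print(delta_aic(log_lik_a=-1000.0, k_a=12, log_lik_b=-1020.0, k_b=10))
```

A model with extra parameters is preferred under AIC only if its gain in log-likelihood exceeds the number of added parameters.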
Abstract
Empirical substitution matrices represent the average tendencies of substitutions over various protein families, at the cost of gene-level resolution. We develop a codon-based model in which the mutational tendencies of codons, the genetic code, and the strength of selective constraints against amino acid replacements can be tailored to a given gene. First, selective constraints averaged over proteins are estimated by maximizing the likelihood of each 1-PAM matrix of empirical amino acid (JTT, WAG, and LG) and codon (KHG) substitution matrices. Then, selective constraints specific to given proteins are approximated as a linear function of those estimated from the empirical substitution matrices. Akaike information criterion (AIC) values indicate that a model allowing multiple nucleotide changes fits the empirical substitution matrices significantly better than one restricted to single nucleotide changes. Also, the ML estimates of…
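To make "multiple nucleotide changes" concrete: two codons can differ at one, two, or three positions, and conventional codon models typically assign a zero instantaneous rate to pairs that differ at more than one position. The sketch below only counts positional differences between sense codons. It assumes the standard genetic code for the stop codons, the helper names are hypothetical, and it is not the paper's substitution model.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Assumption: standard genetic code; other codes have different stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def nucleotide_differences(codon_a: str, codon_b: str) -> int:
    """Number of positions (0 to 3) at which two codons differ."""
    return sum(a != b for a, b in zip(codon_a, codon_b))


def sense_codons() -> list[str]:
    """All 61 non-stop codons under the assumed genetic code."""
    codons = ("".join(triplet) for triplet in product(BASES, repeat=3))
    return [c for c in codons if c not in STOP_CODONS]


def count_pairs_by_differences() -> Counter:
    """Tally unordered sense-codon pairs by number of differing positions."""
    codons = sense_codons()
    counts: Counter = Counter()
    for i, a in enumerate(codons):
        for b in codons[i + 1:]:
            counts[nucleotide_differences(a, b)] += 1
    return counts


if __name__ == "__main__":
    for n_diff, n_pairs in sorted(count_pairs_by_differences().items()):
        print(f"{n_diff} differing position(s): {n_pairs} codon pairs")
```

Pairs in the two- and three-difference classes are the ones that only a model allowing multiple nucleotide changes can connect directly.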
