Inference of higher order substitution dynamics by Markov chain lumping
Olof Görnerup, Martin Nilsson Jacobi

TL;DR
This paper applies Markov chain lumping to an empirical codon substitution matrix, recovering the standard genetic code and higher order amino acid substitution groups without assumptions made beforehand about what should constitute a group.
Contribution
It introduces a novel application of Markov chain lumping to derive amino acid groups directly from empirical substitution matrices without relying on prior assumptions.
Findings
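Identifies the standard genetic code and higher order amino acid substitution groups by aggregating codons
Derives the aggregation from first principles rather than from predefined grouping criteria
Argues that the resulting aggregations capture the multi-level structure of the substitution dynamics more accurately than alternative techniques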
Abstract
We apply Markov chain lumping techniques to aggregate codons from an empirical substitution matrix. The standard genetic code as well as higher order amino acid substitution groups are identified. Since the aggregates are derived from first principles they do not rely on system dependent assumptions made beforehand, e.g. regarding criteria on what should constitute an amino acid group. We therefore argue that the acquired aggregations more accurately capture the multi-level structure of the substitution dynamics than alternative techniques.
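To make the notion of lumping concrete, the sketch below checks the standard strong lumpability condition for a candidate partition of states: a partition is lumpable when, for every pair of blocks, all states in the source block have the same total probability of moving into the target block. This is a minimal illustration only. The function names `is_strongly_lumpable` and `lump`, the 4-state matrix, and the partitions are hypothetical and not taken from the paper, and the code does not reproduce the authors' procedure for finding lumpings in a codon substitution matrix.

```python
def is_strongly_lumpable(P, partition, tol=1e-9):
    """Check strong lumpability of row-stochastic matrix P for a partition.

    P is a list of rows; partition is a list of blocks (lists of state indices).
    """
    for block in partition:
        for target in partition:
            # Total probability of jumping from each state in `block` into `target`.
            masses = [sum(P[i][j] for j in target) for i in block]
            # All states in the block must agree on that total.
            if max(masses) - min(masses) > tol:
                return False
    return True


def lump(P, partition):
    """Build the aggregated transition matrix.

    Only meaningful when the partition is strongly lumpable, because the
    first state of each block is used as its representative.
    """
    return [
        [sum(P[block[0]][j] for j in target) for target in partition]
        for block in partition
    ]


# Toy 4-state chain with made-up numbers (assumption, not data from the paper).
P = [
    [0.5, 0.3, 0.1, 0.1],
    [0.3, 0.5, 0.1, 0.1],
    [0.2, 0.2, 0.4, 0.2],
    [0.1, 0.3, 0.2, 0.4],
]

good = [[0, 1], [2, 3]]  # states within each block behave alike at the block level
bad = [[0, 2], [1, 3]]   # states 1 and 3 disagree on their mass into block {0, 2}

print(is_strongly_lumpable(P, good))
print(is_strongly_lumpable(P, bad))
print(lump(P, good))
```

In the paper's setting the states would be codons and a lumpable partition would group them into aggregates such as amino acids or higher order amino acid groups. An empirical matrix would satisfy the condition only approximately, which the `tol` parameter is meant to suggest.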
Taxonomy
Topics: RNA and protein synthesis mechanisms · Genomics and Phylogenetic Studies · Protein Structure and Dynamics
