Joint optimization of replication potential and information storage set the letter size of primordial genetic alphabet
Hemachander Subramanian

TL;DR
This paper examines the evolutionary advantage of a four-letter genetic alphabet over a two-letter one, arguing that a sequence needs at least four letters to fold predictably into a specific secondary structure while still replicating well in early RNA/DNA-like molecules.
Contribution
It applies a previously used model, which incorporates asymmetric cooperativity, to show that heteropolymer sequences need at least four letters to combine a predictable secondary structure with replication potential.
Findings
Four-letter sequences enable predictable secondary structure formation.
Palindromic sequences optimize replication potential.
A four-letter alphabet outcompetes a two-letter one in RNA-world-like scenarios.
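To make the term "palindromic" in the findings concrete, here is a minimal Python sketch. It assumes the usual molecular-biology sense of the word (a sequence equal to its own reverse complement) and an RNA-like A/U/G/C alphabet with Watson-Crick pairing. The function names are hypothetical and are not taken from the paper, and the sketch does not reproduce the paper's model.

```python
# Assumed pairing rules for an RNA-like four-letter alphabet (illustrative only).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read in the opposite direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindromic(seq: str) -> bool:
    """True if the sequence equals its own reverse complement."""
    return seq == reverse_complement(seq)


if __name__ == "__main__":
    for candidate in ("GAAUUC", "GGAA"):
        print(candidate, is_palindromic(candidate))
```

Under this definition, a palindromic strand and its complement read identically, so each half of the strand can pair with the other half.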
Abstract
The simplest possible informational heteropolymer requires only a two-letter alphabet to be able to store information. The evolutionary choice of four monomers in the informational biomolecules RNA/DNA or their progenitors is intriguing, given the inherent difficulties in the simultaneous and localized prebiotic synthesis of all four monomers of progenitors of DNA from common precursors on early Earth. Excluding the scenario where a two-letter alphabet genome eventually expanded to include two more letters to code for more amino acids on teleological grounds, we show here that a heteropolymer sequence in the RNA-world-like scenario would have had to be composed of at least four letters in order to predictably fold into a specific secondary structure, and hence must have outcompeted the two-letter alphabet genomes. Using a model that we previously used to demonstrate the evolutionary…
Taxonomy
Topics: RNA and protein synthesis mechanisms · DNA and Biological Computing · Fractal and DNA sequence analysis
