# Molecular Complexity Constrained Early Amino Acid Recruitment into the Genetic Code

**Authors:** Syeda Ameena Hashmi, Hamed Chok, Ricardo Cabrera, Celia Blanco

PMC · DOI: 10.1093/gbe/evag012 · Genome Biology and Evolution · 2026-01-20

## TL;DR

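The study uses molecular complexity metrics to reconstruct the order in which amino acids were recruited into the genetic code, and finds that structurally simple amino acids sit at the base of the resulting hierarchy while more complex residues appear later.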

## Contribution

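The paper proposes a complexity-based chronology of amino acid recruitment, built from 16 molecular complexity metrics drawn from chemical graph theory and information theory, and expressed as a minimum spanning tree rather than a linear ranking.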

## Key findings

- A minimum spanning tree derived from molecular complexity aligns with prebiotic and genomic chronologies.
- Amino acids with similar complexity show greater mutational connectivity, suggesting structural constraints shaped the genetic code.
- Molecular complexity correlates significantly with amino acid enrichment in the inferred proteome of the last universal common ancestor (LUCA).

## Abstract

Previously proposed chronologies of amino acid incorporation into the genetic code rely on consensus rankings derived from prebiotic synthesis experiments, biosynthetic pathways, or genomic trends. However, the role of intrinsic molecular properties in shaping amino acid recruitment remains largely underexplored. In this study, we reconstruct a complexity-based amino acid chronology by integrating 16 molecular complexity metrics from chemical graph and information theory. Unlike approaches influenced by environmental variability, detection biases, or the evolutionary constraints of genome-based chronologies, our method provides a perspective on amino acid incorporation independent of these factors. Instead of imposing a linear ranking, we derive a minimum spanning tree capturing complexity-based relationships between amino acids. The resulting hierarchy places structurally simple amino acids in basal positions, while biosynthetically complex residues appear later, aligning with existing prebiotic and genomic chronologies. Furthermore, amino acids positioned closer in the complexity space exhibit significantly greater mutational connectivity than expected by chance, suggesting that molecular complexity reflects underlying structural considerations that constrained the genetic code's evolutionary pathways. This supports the idea that the code evolved not only to maintain biochemical stability but also to facilitate complexity-preserving substitutions, ensuring smooth adaptive transitions while minimizing energetic cost differences. Additionally, molecular complexity significantly correlates with amino acid enrichment in LUCA's inferred proteome, reinforcing its role as a fundamental constraint on early protein evolution. Our approach, rooted in intrinsic molecular properties rather than external contingencies, offers new insights into the constraints shaping the genetic code and expands the scope for identifying universal principles of biochemical evolution.
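To make the minimum spanning tree idea concrete, the following is a minimal sketch, not the authors' pipeline. It assumes each amino acid is described by a small vector of complexity descriptors, measures Euclidean distance between vectors, and builds the tree with Prim's algorithm. The two descriptors used here (heavy-atom count and ring count) and the five-residue subset are illustrative stand-ins for the paper's 16 metrics. The names `features`, `distance`, and `minimum_spanning_tree` are hypothetical, and the choice of unscaled Euclidean distance is an assumption.

```python
import math

# Toy descriptors per amino acid: (heavy-atom count, ring count).
# These are stand-ins chosen for illustration, NOT the paper's 16 metrics.
features = {
    "Gly": (5, 0),
    "Ala": (6, 0),
    "Ser": (7, 0),
    "Val": (8, 0),
    "Trp": (15, 2),
}


def distance(a, b):
    """Euclidean distance between two residues in descriptor space (assumed metric)."""
    return math.dist(features[a], features[b])


def minimum_spanning_tree(nodes):
    """Prim's algorithm on the complete graph over `nodes`.

    Returns a list of (parent, child, weight) edges in the order they were added,
    starting from the first node.
    """
    nodes = list(nodes)
    in_tree = {nodes[0]}
    edges = []
    while len(in_tree) < len(nodes):
        # Pick the cheapest edge that connects the tree to a node outside it.
        u, v = min(
            ((u, v) for u in in_tree for v in nodes if v not in in_tree),
            key=lambda edge: distance(*edge),
        )
        edges.append((u, v, distance(u, v)))
        in_tree.add(v)
    return edges


mst_edges = minimum_spanning_tree(features)
for parent, child, weight in mst_edges:
    print(f"{parent} -> {child}  (distance {weight:.2f})")
```

With these toy values the tree is a simple chain from glycine out to tryptophan, which mirrors the abstract's description of simple residues in basal positions. A real analysis would likely standardize each metric before computing distances so that no single descriptor dominates; that step is omitted here for brevity.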

## Full-text entities

- **Chemicals:** Amino Acid (MESH:D000596)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12951668/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12951668/full.md

## References

91 references — full list in the complete paper: https://tomesphere.com/paper/PMC12951668/full.md

---
Source: https://tomesphere.com/paper/PMC12951668