# A reprogrammed genetic code consisting of 32 distinct amino acids

**Authors:** Takayuki Katoh, Hiroaki Suga

PMC · DOI: 10.1093/nar/gkag140 · Nucleic Acids Research · 2026-02-18

## TL;DR

The authors expanded a reprogrammed genetic code to 32 amino acids (all 20 proteinogenic plus 12 nonproteinogenic) by splitting four-codon boxes between two amino acids, opening a route to macrocyclic peptide libraries for drug discovery.

## Contribution

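Applying the engineered tRNAs tRNAPro1E2 and tRNAiniP to the previously introduced codon box division framework, together with optimized translation conditions, overcomes the low incorporation efficiency that had limited the earlier code to 23 amino acids and raises the total to 32.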

## Key findings

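- The reprogrammed code comprises 32 amino acids: all 20 proteinogenic amino acids (pAAs) plus 12 nonproteinogenic amino acids (npAAs), achieved by combining artificial codon box division with the engineered tRNAs tRNAPro1E2 and tRNAiniP.
- Eleven elongator npAAs and one initiator npAA were incorporated, including β-amino, d-amino, and N-methyl amino acids.
- The initiator npAA, N-chloroacetyl-d-tyrosine, supports peptide macrocyclization, so the platform could be used to generate diverse macrocyclic peptide libraries for drug discovery.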

## Abstract

Sense codon reassignment enables ribosomal incorporation of nonproteinogenic amino acids (npAAs) at any of the 61 sense codons. Because npAAs replace proteinogenic amino acids (pAAs), the total number of available building blocks usually remains limited to 20. To overcome this, we previously introduced “artificial codon box division”, where four-codon boxes (e.g. Val GUN) are split into distinct sets (e.g. GUY and GUG) using in vitro transcribed transfer RNAs (tRNAs) lacking nucleotide modifications. This allows two different amino acids—a pAA and an npAA—to be assigned within the same original box. While we previously demonstrated this by incorporating 23 amino acids, low incorporation efficiency hindered further expansion. Here, we applied our engineered tRNAs, tRNAPro1E2 and tRNAiniP, to the codon box division framework and optimized translation conditions to facilitate multiple npAA incorporations. Consequently, we successfully expanded the genetic code to 32 amino acids, incorporating 11 elongator npAAs and 1 initiator npAA while maintaining all 20 pAAs. Notably, these npAAs include therapeutically significant monomers such as β-amino, d-amino, and N-methyl amino acids, as well as an initiator N-chloroacetyl-d-tyrosine for peptide macrocyclization. This platform offers vast potential for generating diverse macrocyclic peptide libraries with unique chemical entities for drug discovery.
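The codon box division described in the abstract can be illustrated with a minimal Python sketch. It expands the degenerate codons GUN, GUY, and GUG using standard IUPAC nucleotide codes and checks that the two subsets do not overlap. The function names `expand` and `divide_box` and the label `npAA-X` are hypothetical, and the choice of which subset carries Val versus the npAA is an assumption made for illustration; the abstract does not specify it.

```python
from itertools import product

# Standard IUPAC degenerate nucleotide codes (RNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "Y": "CU",    # pyrimidine
    "R": "AG",    # purine
    "N": "ACGU",  # any base
}


def expand(codon):
    """Return the set of concrete codons matched by a degenerate codon."""
    return {"".join(bases) for bases in product(*(IUPAC[b] for b in codon))}


def divide_box(box, assignments):
    """Split one codon box among several amino acids (hypothetical helper).

    `assignments` maps a degenerate sub-codon (e.g. "GUY") to a label.
    Returns the codon -> label table and the codons left unassigned.
    """
    box_codons = expand(box)
    table = {}
    for sub, label in assignments.items():
        for codon in expand(sub):
            if codon not in box_codons:
                raise ValueError(f"{codon} is outside box {box}")
            if codon in table:
                raise ValueError(f"{codon} is assigned twice")
            table[codon] = label
    return table, box_codons - set(table)


# Assumed assignment for illustration only: Val on GUY, an npAA on GUG.
table, unused = divide_box("GUN", {"GUY": "Val", "GUG": "npAA-X"})
print(sorted(table.items()))
print(sorted(unused))  # ['GUA']

# Building-block count stated in the abstract: 20 pAAs + 11 elongator
# npAAs + 1 initiator npAA.
total = 20 + 11 + 1
```

In this sketch the GUY/GUG split covers three of the four GUN codons, so GUA is left unassigned; the sketch does not model why, only that the two subsets are disjoint.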

## Linked entities

- **Chemicals:** N-chloroacetyl-d-tyrosine (PubChem CID 25021406)

## Full-text entities

- **Chemicals:** spermidine (MESH:D013095), UTP (MESH:D014544), sodium acetate (MESH:D019346), KCl (MESH:D011189), Thr (MESH:D013912), dimethylsulfoxide (MESH:D004121), ATP (MESH:D000255), xylene cyanol (MESH:C048951), KOH (MESH:C029943), Cys (MESH:D003545), chloroform (MESH:D002725), -amino acids (MESH:D000596), alpha-cyano-4-hydroxycinnamic acid (MESH:C007175), sulfhydryl (MESH:D013438), creatine phosphate (MESH:D010725), Bicine (MESH:C027494), Phe (MESH:D010649), 2-thienylalanine (MESH:C100177), 10-formyl-5,6,7,8-tetrahydrofolic acid (-), HEPES (MESH:D006531), Na+ (MESH:D012964), CTP (MESH:D003570), glycerol (MESH:D005990), K+ (MESH:D011188), GAG (MESH:D006025), HCl (MESH:D006851), Glu (MESH:D018698), SDS (MESH:D012967), magnesium acetate (MESH:C000656591), DTT (MESH:D004229), ethanol (MESH:D000431), Val (MESH:D014633), phenol (MESH:D019800), Leu (MESH:D007930), peptides (MESH:D010455), 4-methylphenylalanine (MESH:C067359), Ala (MESH:D000409), polyacrylamide (MESH:C016679), N-acetylproline (MESH:C055247), Triton X-100 (MESH:D017830), GMP (MESH:D006157), potassium acetate (MESH:D019347), MgCl2 (MESH:D015636), GTP (MESH:D006160), Abu (MESH:C012223), thioether (MESH:D013440)
- **Species:** Theileria sp. 7 (species) [taxon 2874162], Escherichia coli (E. coli, species) [taxon 562]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12914324/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12914324/full.md

## References

50 references — full list in the complete paper: https://tomesphere.com/paper/PMC12914324/full.md

---
Source: https://tomesphere.com/paper/PMC12914324