# High-throughput analyses of a reconstituted diversity-generating retroelement identify intrinsic and extrinsic determinants of diversification

**Authors:** Irem Unlu, Marina K. Smiley, Vladimir Potapov, Yoan Renoux-Martin, Zhi-Yi Sun, Hoong Chuin Lim

PMC · DOI: 10.1371/journal.pgen.1012038 · PLOS Genetics · 2026-02-05

## TL;DR

Researchers moved a natural 'evolution engine', a diversity-generating retroelement (DGR), into E. coli, raised its efficiency more than 1000-fold, and found that DNA replication influences how well it works.

## Contribution

The study reconstituted the Bordetella phage BPP-1 DGR in E. coli and identified intrinsic (template sequence context) and extrinsic (the host exonuclease ExoI, DNA replication) determinants of its mutagenic retrohoming efficiency.

## Key findings

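- DGR efficiency in E. coli rose more than 1000-fold once key regulatory factors were identified and manipulated.
- AAC motifs, prevalent in natural DGR templates, show a distinct bRT error profile that prioritizes sampling of residues essential for antigen recognition.
- A transposon sequencing screen identified the single-stranded DNA exonuclease ExoI as an inhibitor; removing it raised activity more than ten-fold, and its nuclease activity was dispensable for the inhibition.
- DGR efficiency was higher at targets located near the replication origin and oriented outwardly from it, a bias linked to replication directionality.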

## Abstract

Diversity-Generating Retroelements (DGRs) are specialized genetic systems typically harnessed in nature to evolve new molecular recognition. This mechanism, known as mutagenic retrohoming, relies on an error-prone reverse transcriptase (bRT) that introduces errors at template adenines, followed by the incorporation of the resulting mutagenized complementary DNA (cDNA) into a homologous target gene. Although widely distributed, DGRs are conspicuously absent from key bacterial models, limiting our understanding of their functionality in these hosts and their potential as engineering tools. Here, we demonstrate the ‘plug-and-play’ nature of the Bordetella phage BPP-1 DGR by successfully reconstituting the mutagenic retrohoming mechanism in Escherichia coli. Using high-throughput tools available in this tractable bacterium, we identified key regulatory factors that allowed us to enhance DGR efficiency over 1000-fold. Systematic analysis defines how sequence context governs bRT’s fidelity, uncovering a distinct error profile for the AAC motifs prevalent in natural DGR templates. This intrinsic bias prioritizes the sampling of residues essential for antigen recognition, effectively focusing the evolutionary search within the most productive regions of sequence space. Furthermore, a transposon sequencing screen identified the single-stranded DNA exonuclease ExoI as an inhibitor of DGR activity. While removing ExoI enhanced activity by more than ten-fold, we found that its nuclease activity was dispensable for this inhibition, suggesting a non-catalytic mechanism. Finally, a genome-scale survey highlighted enhanced DGR efficiency at targets located near the replication origin and oriented outwardly from it. This bias is clearly linked to replication directionality, suggesting that certain aspects of DNA replication cycles promote mutagenic retrohoming. 
Collectively, our work reveals previously unappreciated mechanistic features of DGRs and establishes this reconstituted system as a powerful platform for targeted gene diversification and clarifying the molecular mechanism of mutagenic retrohoming.
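
To make the point about adenine-directed mutagenesis and AAC motifs concrete, the sketch below enumerates which amino acids become reachable when every adenine in a codon may change to any base (A-to-N). It is an idealized illustration under stated assumptions, not the paper's measured error profile: all substitutions are treated as equally possible, codons are read in the coding-strand sense with the standard genetic code, and the function names `adenine_variants` and `reachable_amino_acids` are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def adenine_variants(codon):
    """Return every codon reachable when each adenine may become any base.

    Assumption: idealized A-to-N substitution; non-adenine positions are fixed.
    """
    options = [BASES if base == "A" else base for base in codon]
    return ["".join(choice) for choice in product(*options)]


def reachable_amino_acids(codon):
    """Return the sorted set of one-letter residues ('*' = stop) reachable from codon."""
    return sorted({CODON_TABLE[variant] for variant in adenine_variants(codon)})


if __name__ == "__main__":
    for codon in ("AAC", "AAG"):
        residues = reachable_amino_acids(codon)
        print(codon, len(adenine_variants(codon)), "variants ->", "".join(residues))
```

Under these assumptions an AAC codon diversifies into a broad set of residues without ever producing a stop codon, whereas AAG can mutate to a stop (TAG). How often each substitution actually occurs depends on the sequence context the abstract describes, which this sketch does not model.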

Our study focuses on Diversity-Generating Retroelements (DGRs), a biological “evolution engine” that microbes and viruses use to rapidly develop new functions within specific genes. DGRs work by using a sloppy enzyme that intentionally makes mistakes as it converts an RNA sequence into a new DNA strand. This mutated DNA is then incorporated into a target gene, creating a library of diverse variants from which new, beneficial functions can arise. To study this more effectively, we moved a viral DGR system into a common laboratory bacterium, Escherichia coli. Using high-throughput methods, we discovered how the system tilts the scales towards mutations that help proteins recognize new targets. We also found ways to boost this activity more than 1000-fold. Finally, we found that DNA replication is a major driver of how efficiently the DGR works, offering new clues into how the mutations are incorporated into the target gene. By uncovering these rules, we provide a new roadmap for fine-tuning this “engine” for future applications in biotechnology.
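
The replication-related finding depends on two geometric properties of a target: how far it lies from the replication origin, and whether it points the same way the replication fork travels. The sketch below shows one way these could be computed on a circular chromosome. It is only an illustration: the coordinates are toy values, the function name `origin_geometry` is hypothetical, and reading "oriented outwardly" as "pointing away from the origin, in the direction of fork movement" is an assumption rather than the paper's stated definition.

```python
def origin_geometry(target_pos, target_strand, ori_pos, ter_pos, genome_len):
    """Describe a target relative to the origin on a circular chromosome.

    target_strand is +1 (points toward increasing coordinates) or -1.
    Returns (distance_to_origin, points_away_from_origin).
    Assumption: two forks leave ori_pos in opposite directions and meet at ter_pos.
    """
    # Offset of the target from the origin, walking toward increasing coordinates.
    cw_offset = (target_pos - ori_pos) % genome_len
    # Length of the arm replicated by the fork that moves toward increasing coordinates.
    cw_arm_len = (ter_pos - ori_pos) % genome_len

    if cw_offset <= cw_arm_len:
        distance = cw_offset
        fork_direction = 1   # fork moves toward increasing coordinates here
    else:
        distance = genome_len - cw_offset
        fork_direction = -1  # fork moves toward decreasing coordinates here

    return distance, target_strand == fork_direction


if __name__ == "__main__":
    # Toy chromosome: 1000 bp, origin at 100, terminus at 600 (illustrative values only).
    print(origin_geometry(200, +1, ori_pos=100, ter_pos=600, genome_len=1000))
    print(origin_geometry(50, +1, ori_pos=100, ter_pos=600, genome_len=1000))
```

With real data, each target's measured efficiency could be tabulated against these two values to look for the origin-proximity and orientation trends described above.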

## Linked entities

- **Genes:** brt (Reverse transcriptase) [NCBI Gene 2717203], sbcB (exodeoxyribonuclease I, ExoI) [NCBI Gene 946529]
- **Species:** Bordetella (taxon 517), Escherichia coli (taxon 562), Mus musculus (taxon 10090)

## Full-text entities

- **Genes:** sbcB (exodeoxyribonuclease I) [NCBI Gene 946529] {aka ECK2005, cpeA, exoI, rmuA, xonA}, F2R (coagulation factor II thrombin receptor) [NCBI Gene 2149] {aka CF2R, HTR, PAR-1, PAR1, TR}, tufA (translation elongation factor Tu 1) [NCBI Gene 947838] {aka ECK3326, kirT, pulT}, MT1E (metallothionein 1E) [NCBI Gene 4493] {aka MT-1E, MT-IE, MT1, MTD}, brt (Reverse transcriptase) [NCBI Gene 2717203]
- **Chemicals:** adenine (MESH:D000225), water (MESH:D014867), NaCl (MESH:D012965), asparagine (MESH:D001216), agar (MESH:D000362), polyacrylamide (MESH:C016679), Spectinomycin (MESH:D000198), serine (MESH:D012694), aspartic acid (MESH:D001224), tyrosine (MESH:D014443), proton (MESH:D011522), C (MESH:D002244), DAP (MESH:C041756), dGTP (MESH:C029603), Tn (MESH:C009497), ACC (MESH:C023863), BPP-1 (MESH:C006791), tetracycline (MESH:D013752), glycine (MESH:D005998), Alexa488N (-), cytosine (MESH:D003596), chloramphenicol (MESH:D002701), glucose (MESH:D005947), NAT (MESH:C041665), poly-C (MESH:D011066), dCTP (MESH:C024107), A (MESH:D001151), Agarose (MESH:D012685), ampicillin (MESH:D000667), CAN (MESH:C004653), Kanamycin (MESH:D007612), IPTG (MESH:D007544), N (MESH:D009584), NAG (MESH:D000117), Arabinose (MESH:D001089), Guanosine (MESH:D006151)
- **Species:** Escherichia coli str. K-12 substr. MG1655 (no rank) [taxon 511145], Brevundimonas sp. PP1 (species) [taxon 231982], Saccharomyces cerevisiae (baker's yeast, species) [taxon 4932], Escherichia coli (E. coli, species) [taxon 562], Bordetella (genus) [taxon 517]
- **Mutations:** S1230S, G-to-T, K580A, C in 25, cytosine in the -2, A-to-N, aspartic acid instead of the tyrosine, AAG to stop, A to G, cytosine at the -1
- **Cell lines:** MG1655 — Homo sapiens (Human), Maple syrup urine disease, Transformed cell line (CVCL_D514), HCL1 — Homo sapiens (Human), Cutis laxa, Finite cell line (CVCL_9V72)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12875486/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12875486/full.md

## References

65 references — full list in the complete paper: https://tomesphere.com/paper/PMC12875486/full.md

---
Source: https://tomesphere.com/paper/PMC12875486