# Machine learning–directed massively parallel programmable nucleic acid amplification

**Authors:** Zhi Weng, Wenle Huang, Yi Wu, Xuehao Xiu, Hui Lv, Fei Wang, Xiaolei Zuo, Chunhai Fan, Ping Song

PMC · DOI: 10.1126/sciadv.aec9175 · 2026-03-25

## TL;DR

A machine learning approach enables precise control of DNA amplification, improving data storage and disease detection.

## Contribution

A thermodynamics-based primer-tag strategy with machine learning enhances amplification control for diagnostics and DNA storage.

## Key findings

- Machine learning improved amplification prediction accuracy from R² = 0.62 to R² = 0.86.
- The method increased DNA data storage density by nearly tenfold and enabled robust steganography.
- It detected rare RNA fusions with 100-fold higher sensitivity in cervical cancer analysis.
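For readers unfamiliar with the accuracy metric quoted in the first finding, the following minimal sketch shows how a coefficient of determination (R²) is conventionally computed from observed and predicted values. The function name `r_squared` and the toy numbers are illustrative assumptions; this is the standard textbook definition, not the authors' evaluation code, and the summary does not say which exact variant of R² the paper reports.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    `observed` and `predicted` are equal-length sequences of numbers,
    e.g. measured vs. model-predicted amplification efficiencies
    (hypothetical inputs for illustration).
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with >= 2 values")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left over after prediction.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy values, not data from the paper.
    print(r_squared([0.2, 0.5, 0.8], [0.25, 0.45, 0.85]))
```

Under this definition, a model that always predicts the mean of the observations scores 0 and a perfect model scores 1, so the reported increase means the model leaves less of the variance in amplification unexplained.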

## Abstract

Dynamic regulation of amplification efficiency is pivotal yet challenging in molecular diagnostics and DNA data storage. Here, we develop a thermodynamics-based approach to achieve continuous and precise modulation of nucleic acid amplification efficiency. By decoupling sequence specificity from hybridization energy regulation via a primer-tag compensation strategy, we demonstrate programmed amplification with high resolution (33 versus 81%). Leveraging 2483 experimental data points, we constructed a machine learning model that improved prediction accuracy from R² = 0.62 to R² = 0.86. In DNA data storage, this amplification strategy increases the density for information preview by nearly one order of magnitude and enables robust file steganography via differential amplification. In clinical validation, our method outperformed uniform amplification in cervical cancer RNA variant analysis, detecting rare RNA fusions and improving detection sensitivity by 100-fold at a simulated sequencing depth of 10⁴. This programmable technique is anticipated to extend to single-cell sequencing and spatial transcriptomics, offering a powerful tool for molecular diagnostics and synthetic biology.

Machine learning–guided tunable DNA amplification enables intelligent DNA data storage and ultrasensitive molecular diagnostics.
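To make the idea of tuning amplification through hybridization energy more concrete, here is a minimal sketch of the textbook two-state binding model, in which the fraction of template bound by primer depends on the hybridization free energy ΔG. This is an assumption chosen for illustration: the summary does not state the model the authors use, and the function name `bound_fraction`, the 60 °C temperature, and the 100 nM primer concentration are all hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def bound_fraction(delta_g_kcal, primer_conc_molar, temp_celsius=60.0):
    """Fraction of template hybridized to primer in a two-state model.

    Assumes primer is in large excess over template, so that
    fraction = K*c / (1 + K*c) with K = exp(-dG / (R*T)).
    All parameter values used with this function are illustrative.
    """
    temp_kelvin = temp_celsius + 273.15
    k_eq = math.exp(-delta_g_kcal / (R_KCAL * temp_kelvin))
    x = k_eq * primer_conc_molar
    return x / (1.0 + x)


if __name__ == "__main__":
    # Sweep hypothetical dG values (kcal/mol) at 100 nM primer.
    for dg in (-8.0, -9.0, -10.0, -11.0, -12.0):
        print(dg, round(bound_fraction(dg, 1e-7), 3))
```

In this simplified picture, small changes in ΔG near the midpoint of the binding curve shift the bound fraction continuously, which is one way to read the abstract's claim of continuous efficiency modulation. The paper's measured efficiencies and its machine learning model may well depend on additional factors not captured here.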

## Linked entities

- **Diseases:** cervical cancer (MONDO:0002974)

## Full-text entities

- **Genes:** ACTB (actin beta) [NCBI Gene 60] {aka BKRNS, BNS, BRWS1, CSMH, DDS1, PS1TP5BP1}, FGFR3 (fibroblast growth factor receptor 3) [NCBI Gene 2261] {aka ACH, CD333, CEK2, HSFGFR3EX, JTK4}, ITIH3 (inter-alpha-trypsin inhibitor heavy chain 3) [NCBI Gene 3699] {aka H3P, ITI-HC3, SHAP}, TACC3 (transforming acidic coiled-coil containing protein 3) [NCBI Gene 10460] {aka ERIC-1, ERIC1, Tacc4, maskin}
- **Diseases:** malignancies (MESH:D009369), cervical cancer (MESH:D002583), bladder cancer (MESH:D001749)
- **Chemicals:** water (MESH:D014867), oligonucleotide (MESH:D009841), salt (MESH:D012492), Na+ (MESH:D012964), SYBR Green (MESH:C098022), ΔG (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13015887/full.md

---
Source: https://tomesphere.com/paper/PMC13015887