# Efficient DNA Coding Algorithm for Polymerase Chain Reaction Amplification Information Retrieval

**Authors:** Qing Wang, Shufang Zhang, Yuhui Li

PMC · DOI: 10.3390/ijms25126449 · 2024-06-11

## TL;DR

This paper introduces ECA-PCRAIR, a DNA coding algorithm that reduces nonspecific pairing between the 3’ end of the primer and stored DNA sequences, improving PCR amplification accuracy while reaching a storage density of 2.14–3.67 bits/nt.

## Contribution

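ECA-PCRAIR builds a codebook by variable-length scanning and pruning optimization to maximize storage density under traditional biological constraints, refines that codebook with a codeword search tree derived from the primer library, and applies a variable-length interleaver to detect and correct constraint violations, minimizing the likelihood of nonspecific pairing.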

## Key findings

- ECA-PCRAIR reduces the probability of nonspecific pairing between the 3’ end of the primer and the DNA sequence to 2–25%.
- The algorithm achieves a storage density of 2.14–3.67 bits per nucleotide.

## Abstract

Polymerase Chain Reaction (PCR) amplification is widely used for retrieving information from DNA storage. During the PCR amplification process, nonspecific pairing between the 3’ end of the primer and the DNA sequence can cause cross-talk in the amplification reaction, leading to the generation of interfering sequences and reduced amplification accuracy. To address this issue, we propose an efficient coding algorithm for PCR amplification information retrieval (ECA-PCRAIR). This algorithm employs variable-length scanning and pruning optimization to construct a codebook that maximizes storage density while satisfying traditional biological constraints. Subsequently, a codeword search tree is constructed based on the primer library to optimize the codebook, and a variable-length interleaver is used for constraint detection and correction, thereby minimizing the likelihood of nonspecific pairing. Experimental results demonstrate that ECA-PCRAIR can reduce the probability of nonspecific pairing between the 3’ end of the primer and the DNA sequence to 2–25%, enhancing the robustness of the DNA sequences. Additionally, ECA-PCRAIR achieves a storage density of 2.14–3.67 bits per nucleotide (bits/nt), significantly improving storage capacity.
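
To make the two kinds of checks mentioned in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that the "traditional biological constraints" are a GC-content window and a homopolymer run limit, and it approximates nonspecific pairing as the stored strand containing the reverse complement of the last few bases at the primer's 3’ end. The function names, the thresholds (`gc_min`, `gc_max`, `max_run`), and the tail length (`tail_len`) are illustrative assumptions that do not come from the paper.

```python
# Illustrative sketch only: names and thresholds are assumptions, not from the paper.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a non-empty sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def max_homopolymer_run(seq: str) -> int:
    """Length of the longest run of identical bases in a non-empty sequence."""
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


def satisfies_basic_constraints(seq: str, gc_min: float = 0.4,
                                gc_max: float = 0.6, max_run: int = 3) -> bool:
    """Check the assumed GC-content window and homopolymer limit."""
    return (gc_min <= gc_content(seq) <= gc_max
            and max_homopolymer_run(seq) <= max_run)


def has_nonspecific_site(seq: str, primer: str, tail_len: int = 5) -> bool:
    """True if the strand contains a site the primer's 3' tail could pair with.

    Simplification: exact reverse-complement match of the last `tail_len`
    bases of the primer, on the given strand only.
    """
    tail = primer[-tail_len:]
    return reverse_complement(tail) in seq


if __name__ == "__main__":
    primer = "GGATCCACGTT"  # hypothetical primer, 3' tail is ACGTT
    for candidate in ("TTAACGTCC", "ACGTACGT"):
        print(candidate,
              satisfies_basic_constraints(candidate),
              has_nonspecific_site(candidate, primer))
```

A codebook builder in this spirit would discard or correct any candidate sequence for which `satisfies_basic_constraints` is false or `has_nonspecific_site` is true for some primer in the library. The paper's search tree and interleaver are more elaborate than this exact-match filter.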

## Full-text entities

- **Diseases:** injury to people or property (MESH:C000719191)
- **Chemicals:** GC (MESH:C057580)
- **Cell lines:** S2 — Drosophila melanogaster (Fruit fly), Spontaneously immortalized cell line (CVCL_Z232)

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11204281/full.md

---
Source: https://tomesphere.com/paper/PMC11204281