Mutually Uncorrelated Primers for DNA-Based Data Storage

S. M. Hossein Tabatabaei Yazdi; Han Mao Kiah; Ryan Gabrys; Olgica Milenkovic

arXiv:1709.05214 · cs.IT · September 18, 2017

TL;DR

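This paper introduces weakly mutually uncorrelated (WMU) sequences for DNA-based data storage, in which no sufficiently long suffix of one sequence is a prefix of the same or another sequence, and gives bounds and constructions for WMU codes that are also balanced, error-correcting, and primer-dimer free.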

Contribution

It defines WMU sequences, derives bounds on the size of WMU and constrained WMU codes, and presents constructions for balanced, error-correcting, primer-dimer free WMU codes using Dyck paths, prefix-synchronized codes, and cyclic codes.

Findings

01

Derived bounds on WMU code sizes

02

Constructed balanced, error-correcting WMU codes

03

Constructed WMU codes that avoid primer-dimer byproducts

Abstract

We introduce the notion of weakly mutually uncorrelated (WMU) sequences, motivated by applications in DNA-based data storage systems and for synchronization of communication devices. WMU sequences are characterized by the property that no sufficiently long suffix of one sequence is the prefix of the same or another sequence. WMU sequences used for primer design in DNA-based data storage systems are also required to be at large mutual Hamming distance from each other, have balanced compositions of symbols, and avoid primer-dimer byproducts. We derive bounds on the size of WMU and various constrained WMU codes and present a number of constructions for balanced, error-correcting, primer-dimer free WMU codes using Dyck paths, prefix-synchronized and cyclic codes.
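
The properties named in the abstract can be checked by brute force on a small candidate primer set. The sketch below is illustrative only and is not the authors' construction. It makes several assumptions: all sequences have the same length, the threshold for a "sufficiently long" overlap is a parameter called kappa here, and "balanced" is read as exactly 50% G/C content. All function names are hypothetical. The primer-dimer condition is left out because the abstract does not define it.

```python
from itertools import combinations, product


def is_wmu(codebook, kappa):
    """Return True if no proper suffix of length >= kappa of any sequence
    equals a prefix of the same or another sequence.

    Assumes all sequences have the same length; `kappa` is this sketch's
    name for the overlap threshold.
    """
    n = len(codebook[0])
    if any(len(seq) != n for seq in codebook):
        raise ValueError("all sequences must have the same length")
    # Ordered pairs, including (a, a), so self-overlaps are checked too.
    for a, b in product(codebook, repeat=2):
        # Only proper overlaps: lengths kappa .. n - 1.
        for length in range(kappa, n):
            if a[-length:] == b[:length]:
                return False
    return True


def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_hamming_distance(codebook):
    """Smallest pairwise Hamming distance over distinct sequences."""
    return min(hamming_distance(a, b) for a, b in combinations(codebook, 2))


def is_balanced(seq):
    """Assumed reading of 'balanced': exactly half the symbols are G or C."""
    gc_count = sum(ch in "GC" for ch in seq)
    return 2 * gc_count == len(seq)


primers = ["ACGA", "AGCA"]
print(is_wmu(primers, kappa=2))
print(is_wmu(primers, kappa=1))
print(min_hamming_distance(primers))
print(all(is_balanced(p) for p in primers))
```

With kappa=1 the check forbids every overlap, including a single shared symbol, so the toy set fails because both sequences begin and end with A. With kappa=2 the same set passes. This is the sense in which the condition is "weakly" uncorrelated: short overlaps are tolerated, long ones are not.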
