Central Limit Theorem for Mutation Systems
Liav Koram, Ohad Elishco

TL;DR
This paper establishes a Central Limit Theorem for mutation systems modeling in-vivo DNA data storage, characterizing their stochastic fluctuations and providing a foundation for understanding sequence evolution over time.
Contribution
The paper establishes a Central Limit Theorem (CLT) for mutation systems, using spectral analysis and martingale techniques to analyze how sequences evolve in in-vivo DNA storage.
Findings
Derived the asymptotic distribution of sequence counts
Explicitly calculated the limiting covariance matrix
Provided a theoretical framework for error analysis in DNA storage
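To make the first finding concrete, here is a minimal simulation sketch of what "fluctuations of sequence counts" can look like. It is not the paper's model: the alphabet `ACGT`, the uniform choice of position, the duplication probability `p_dup`, and the function names `simulate_counts` and `normalized_fluctuations` are all assumptions made for illustration. Each step either duplicates or substitutes one symbol, and the count of `A` is then centered and scaled by the square root of the sequence length. Whether such a statistic is asymptotically Gaussian depends on the system's parameters, so the sketch only shows how one might examine the fluctuations empirically.

```python
import random
import statistics

ALPHABET = "ACGT"  # assumed alphabet, chosen for illustration only


def simulate_counts(steps, p_dup=0.5, seed=None, start="ACGT"):
    """Run a toy mutation process and return the final symbol counts.

    Hypothetical model: at each step a uniformly random position is picked;
    with probability p_dup its symbol is duplicated in place (length + 1),
    otherwise it is replaced by a uniformly random symbol (possibly the same).
    """
    rng = random.Random(seed)
    seq = list(start)
    for _ in range(steps):
        i = rng.randrange(len(seq))
        if rng.random() < p_dup:
            seq.insert(i, seq[i])          # duplication
        else:
            seq[i] = rng.choice(ALPHABET)  # substitution
    return {a: seq.count(a) for a in ALPHABET}


def normalized_fluctuations(runs, steps, p_dup=0.5, seed=0):
    """Return (count_A - n/4) / sqrt(n) for several independent runs.

    n is the final sequence length; n/4 is the mean count of 'A' by symmetry
    when the process starts from the balanced sequence "ACGT".
    """
    rng = random.Random(seed)
    values = []
    for _ in range(runs):
        counts = simulate_counts(steps, p_dup, seed=rng.randrange(2**32))
        n = sum(counts.values())
        values.append((counts["A"] - n / 4) / n ** 0.5)
    return values


if __name__ == "__main__":
    samples = normalized_fluctuations(runs=200, steps=2000)
    print("mean:", statistics.mean(samples))
    print("stdev:", statistics.stdev(samples))
```

Plotting a histogram of `samples` for increasing `steps` is one way to check by eye whether the normalized counts settle into a bell shape for a given `p_dup`.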
Abstract
DNA-based storage has emerged as a promising alternative to traditional data storage methods, offering unmatched advantages in data density, longevity, and sustainability. Two main approaches have developed: in-vitro storage, where information is synthesized in controlled environments, and in-vivo storage, where data is embedded within an organism's DNA for enhanced confidentiality and protection. While in-vivo DNA storage provides unique advantages, it faces significant challenges from mutations, including duplications, deletions, and substitutions, which cause sequence evolution over time. Thus, in-vivo systems experience continuous sequence alterations that increase length and change composition, making error correction particularly challenging. We study the asymptotic behavior of mutation systems, which model the probabilistic evolution of sequences over a finite alphabet, and are…
