The Capacity of String-Replication Systems
Farzad Farnoud (Hassanzadeh), Moshe Schwartz, Jehoshua Bruck

TL;DR
This paper studies how many distinct sequences a string-replication system can generate from a short initial sequence. It gives exact capacities, and bounds on capacities, for four fundamental systems, motivated by the role of repeated sequences in genomic evolution.
Contribution
The paper formally analyzes the capacity, or expressive power, of string-replication systems, including those resembling genomic replication processes, and reports exact capacities and capacity bounds for four fundamental systems.
Findings
Exact capacities for four fundamental string-replication systems.
Bounds on the capacities of these systems.
Insight into whether simple replication rules can generate an exponentially large number of sequences from a short initial sequence.
Abstract
It is known that the majority of the human genome consists of repeated sequences. Furthermore, it is believed that a significant part of the rest of the genome also originated from repeated sequences and has mutated to its current form. In this paper, we investigate the possibility of constructing an exponentially large number of sequences from a short initial sequence and simple replication rules, including those resembling genomic replication processes. In other words, our goal is to find out the capacity, or the expressive power, of these string-replication systems. Our results include exact capacities, and bounds on the capacities, of four fundamental string-replication systems.
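To make the notion of capacity concrete, the following minimal sketch enumerates the strings reachable from a seed under one illustrative rule and measures how fast their number grows. Several things here are assumptions rather than details taken from the abstract: the rule (tandem duplication of a substring of fixed length k), the function names, and the normalized rate log(count)/n used as a finite-length stand-in for capacity. The systems analyzed in the paper may be defined differently.

```python
from math import log


def tandem_duplications(s, k):
    """Strings obtained from s by copying one length-k substring right after itself.

    Assumed illustrative rule: u v w -> u v v w with len(v) == k.
    """
    return {s[:i + k] + s[i:i + k] + s[i + k:] for i in range(len(s) - k + 1)}


def reachable_by_length(seed, k, max_len):
    """Map each reachable length n <= max_len to the set of derivable strings."""
    levels = {len(seed): {seed}}
    n = len(seed)
    while n + k <= max_len:
        nxt = set()
        for s in levels[n]:
            nxt |= tandem_duplications(s, k)
        levels[n + k] = nxt
        n += k
    return levels


def rate(count, n, alphabet_size):
    """Normalized growth rate: log base alphabet_size of count, divided by n."""
    return log(count, alphabet_size) / n


if __name__ == "__main__":
    levels = reachable_by_length("ab", 1, 12)
    for n in sorted(levels):
        print(n, len(levels[n]), round(rate(len(levels[n]), n, 2), 3))
```

With the seed "ab" and k = 1, every reachable string is a run of a's followed by a run of b's, so the count at length n is n - 1 and the rate tends to zero. A rule whose rate stays bounded away from zero as n grows would instead generate exponentially many sequences, which is the behavior the abstract asks about.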
Taxonomy
Topics: Algorithms and Data Compression · DNA and Biological Computing · Genomics and Phylogenetic Studies
