On the Reverse-Complement String-Duplication System
Eyar Ben-Tolila, Moshe Schwartz

TL;DR
This paper investigates the reverse-complement string-duplication system, which is inspired by DNA mutations. It classifies the conditions for full expressiveness, analyzes capacity and entropy-rate for binary systems, and constructs codes that correct a single reverse-complement duplication with redundancy that does not depend on the alphabet size.
Contribution
It provides a complete classification of the system's expressiveness, analyzes capacity and entropy for binary systems, and introduces codes for correcting reverse-complement duplications.
Findings
The conditions under which the system has full expressiveness are fully classified, for all alphabets and all fixed duplication lengths.
Binary systems with duplication length 2 have full capacity but zero entropy-rate.
The constructed codes correct a single reverse-complement duplication of odd length over any alphabet, with redundancy (in bits) that does not depend on the alphabet size.
Abstract
Motivated by DNA storage in living organisms, and by known biological mutation processes, we study the reverse-complement string-duplication system. We fully classify the conditions under which the system has full expressiveness, for all alphabets and all fixed duplication lengths. We then focus on binary systems with duplication length 2 and prove that they have full capacity, yet surprisingly, have zero entropy-rate. Finally, by using binary single burst-insertion correcting codes, we construct codes that correct a single reverse-complement duplication of odd length, over any alphabet. The redundancy (in bits) of the constructed code does not depend on the alphabet size.
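The abstract does not spell out the duplication rule, so the following is only a minimal sketch of what one reverse-complement duplication step might look like over the binary alphabet. It assumes that a substring of length k is copied, reversed, complemented symbol by symbol (0 and 1 swapped), and inserted immediately after the original substring. The function names and the insertion position are illustrative assumptions, not taken from the paper.

```python
# Hypothetical helper names; the binary complement map and the
# "insert right after the copied substring" rule are assumptions.
COMPLEMENT = {"0": "1", "1": "0"}


def reverse_complement(s: str) -> str:
    """Reverse the string and complement every symbol."""
    return "".join(COMPLEMENT[c] for c in reversed(s))


def rc_duplicate(x: str, i: int, k: int) -> str:
    """Apply one reverse-complement duplication of length k at position i.

    The substring v = x[i:i+k] is kept in place, and its reverse
    complement is inserted directly after it.
    """
    if i < 0 or k <= 0 or i + k > len(x):
        raise ValueError("substring x[i:i+k] must lie inside x")
    v = x[i:i + k]
    return x[:i + k] + reverse_complement(v) + x[i + k:]


if __name__ == "__main__":
    # Duplication length 2, the case the abstract singles out for binary systems.
    print(rc_duplicate("0010", 1, 2))
```

Under these assumptions, each step lengthens the string by exactly k symbols, so the set of strings reachable from a seed can be explored by applying `rc_duplicate` repeatedly at every valid position.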
Taxonomy
Topics: DNA and Biological Computing · Advanced biosensing and bioanalysis techniques · Algorithms and Data Compression
