Efficient Encoding/Decoding of Irreducible Words for Codes Correcting Tandem Duplications
Yeow Meng Chee, Johan Chrisnata, Han Mao Kiah, Tuan Thanh Nguyen

TL;DR
This paper develops efficient encoding and decoding algorithms for irreducible words used in DNA data storage codes that correct tandem duplications, achieving near-optimal rates and reduced space complexity.
Contribution
It introduces an $(\ell,m)$-finite state encoder with near-optimal rate and provides ranking/unranking algorithms that reduce space requirements for irreducible words.
Findings
Encoder achieves rate within epsilon of optimal
Algorithms enable efficient ranking and unranking of irreducible words
Reduced space complexity for encoding algorithms
Abstract
Tandem duplication is the process of inserting a copy of a segment of DNA adjacent to the original position. Motivated by applications that store data in living organisms, Jain et al. (2017) proposed the study of codes that correct tandem duplications. Known code constructions are based on irreducible words. We study efficient encoding/decoding methods for irreducible words. First, we describe an $(\ell,m)$-finite state encoder and show that, for suitable choices of $\ell$ and $m$, the encoder achieves a rate that is $\epsilon$ away from the optimal. Next, we provide ranking/unranking algorithms for irreducible words and modify the algorithms to reduce the space requirements for the finite state encoder.
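To make the abstract's terms concrete, here is a minimal Python sketch. It is an illustration only, not the paper's encoder or its ranking algorithms. It assumes the DNA alphabet `ACGT` and assumes that a word is irreducible when it contains no adjacent repeat `vv` with `1 <= |v| <= max_len` (a common definition in this line of work, taken here as an assumption). The function names and the `max_len` parameter are hypothetical, and `rank`/`unrank` use brute-force enumeration in lexicographic order, which is exactly the exponential cost that efficient ranking/unranking algorithms are meant to avoid.

```python
from itertools import product

ALPHABET = "ACGT"  # assumed alphabet; already in sorted order


def tandem_duplicate(word: str, start: int, length: int) -> str:
    """Insert a copy of word[start:start+length] right after that segment."""
    if length <= 0 or start < 0 or start + length > len(word):
        raise ValueError("segment out of range")
    segment = word[start:start + length]
    return word[:start + length] + segment + word[start + length:]


def is_irreducible(word: str, max_len: int = 3) -> bool:
    """Assumed definition: no adjacent repeat vv with 1 <= |v| <= max_len."""
    for k in range(1, max_len + 1):
        for i in range(len(word) - 2 * k + 1):
            if word[i:i + k] == word[i + k:i + 2 * k]:
                return False
    return True


def irreducible_words(n: int, max_len: int = 3) -> list:
    """All irreducible words of length n, in lexicographic order (brute force)."""
    candidates = ("".join(t) for t in product(ALPHABET, repeat=n))
    return [w for w in candidates if is_irreducible(w, max_len)]


def rank(word: str, max_len: int = 3) -> int:
    """Index of an irreducible word among irreducible words of the same length."""
    return irreducible_words(len(word), max_len).index(word)


def unrank(index: int, n: int, max_len: int = 3) -> str:
    """Inverse of rank: the irreducible word of length n at the given index."""
    return irreducible_words(n, max_len)[index]
```

For example, `tandem_duplicate("ACGT", 1, 2)` duplicates the segment `CG` and returns `"ACGCGT"`, which `is_irreducible` rejects because it contains the repeat `CGCG`. Under these assumptions, `unrank(rank(w), len(w))` returns `w` for any irreducible `w`, so `unrank` maps integers to irreducible words (encoding) and `rank` maps them back (decoding).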
