Efficient Constraining of Transcoding in DNA-Based Image Storage
Sara Al Sayyed, Aline Roumy, Thomas Maugey

TL;DR
This paper introduces two transcoding methods for DNA-based image storage that effectively balance data compression and error minimization by controlling homopolymer constraints, reducing synthesis costs and sequencing errors.
Contribution
It proposes novel transcoding techniques that enforce homopolymer constraints to minimize errors while maintaining near-optimal compression rates.
Findings
The first method eliminates homopolymers, at the cost of a 2.14% increase in data rate.
The second method allows a limited increase in homopolymers with minimal impact.
Both methods balance error reduction against compression efficiency.
Abstract
DNA has emerged as a promising alternative for long-term data storage due to its high capacity, durability, and low-energy potential. However, storing data in DNA presents several challenges. First, it requires complex and costly biochemical processes, making efficient compression crucial to reducing DNA synthesis time and cost. Second, these processes are prone to errors that must be avoided and/or corrected. In particular, homopolymers (repetitions of the same nucleotide) are a well-known source of errors during the sequencing step. Avoiding such repetitions helps mitigate errors but introduces a constraint that may increase the data compression rate. In this paper, we propose two transcoding methods that address these two key challenges: reducing data rate and minimizing errors. The first method strictly enforces the error-minimization constraint by eliminating homopolymers of a…
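To make the homopolymer constraint concrete, here is a minimal Python sketch. It is not the paper's transcoding method: it uses a generic rotating code in which each base-3 digit selects one of the three nucleotides that differ from the previous one, so no nucleotide is ever repeated. The function names, the choice of `"A"` as a virtual starting nucleotide, and the convention that any run of two or more identical nucleotides counts as a homopolymer are all assumptions made for illustration.

```python
from itertools import groupby

NUCLEOTIDES = "ACGT"


def longest_homopolymer(seq: str) -> int:
    """Return the length of the longest run of identical nucleotides."""
    return max((len(list(run)) for _, run in groupby(seq)), default=0)


def rotating_encode(trits, start="A"):
    """Map base-3 digits to nucleotides so that no two neighbours match.

    `start` is a virtual previous nucleotide (an assumption of this sketch);
    it is not emitted in the output.
    """
    prev, out = start, []
    for t in trits:
        # Exactly three nucleotides differ from the previous one.
        choices = [n for n in NUCLEOTIDES if n != prev]
        prev = choices[t]
        out.append(prev)
    return "".join(out)


def rotating_decode(seq, start="A"):
    """Invert rotating_encode, recovering the base-3 digits."""
    prev, trits = start, []
    for n in seq:
        choices = [c for c in NUCLEOTIDES if c != prev]
        trits.append(choices.index(n))
        prev = n
    return trits


if __name__ == "__main__":
    trits = [0, 0, 0, 1, 2, 2, 1, 0]
    strand = rotating_encode(trits)
    print(strand, longest_homopolymer(strand))
    print(rotating_decode(strand) == trits)
```

In this sketch each nucleotide carries one base-3 digit, about 1.58 bits, instead of the 2 bits an unconstrained quaternary code allows. That gap is the kind of rate cost the abstract says the constraint can introduce.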
Taxonomy
Topics: DNA and Biological Computing · DNA and Nucleic Acid Chemistry · Algorithms and Data Compression
