Learning Structurally Stabilized Representations for Multi-modal Lossless DNA Storage
Ben Cao, Tiantian He, Xue Li, Bin Wang, Xiaohu Wu, Qiang Zhang, Yew-Soon Ong

TL;DR
This paper introduces RSRL, a novel end-to-end model inspired by error correction and structural biology, for learning highly durable, dense, and lossless multi-modal DNA storage representations, outperforming existing methods.
Contribution
The paper proposes RSRL, a new model that integrates Reed-Solomon coding, error correction, and biological stabilization to improve DNA data storage.
Findings
Higher information density achieved
Lower error rates in storage
Enhanced durability of representations
Abstract
In this paper, we present Reed-Solomon coded single-stranded representation learning (RSRL), a novel end-to-end model for learning representations for multi-modal lossless DNA storage. In contrast to existing learning-based methods, RSRL is inspired by both error-correction codecs and structural biology. Specifically, RSRL first learns the storage representations from binary data that has been transformed by the Reed-Solomon codec. The representations are then masked by an RS-code-informed mask so that learning focuses on correcting the burst errors that occur during the learning process. Given the decoded, error-corrected representations, a novel biologically stabilized loss regularizes the data representations to possess stable single-stranded structures. By incorporating these novel strategies, the proposed RSRL can learn highly durable, dense, and lossless…
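For orientation, the kind of binary-to-nucleotide mapping that DNA storage builds on can be sketched with a fixed rule. This is only an illustrative baseline and not the learned RSRL representation: the 2-bit mapping (00→A, 01→C, 10→G, 11→T) and the function names `bytes_to_dna`, `dna_to_bytes`, and `gc_content` are assumptions made for this example, and it includes neither the Reed-Solomon codec, the RS-code-informed mask, nor the biologically stabilized loss.

```python
# Illustrative baseline only: a fixed 2-bit-per-nucleotide mapping,
# NOT the learned RSRL representation described in the abstract.
BASES = "ACGT"  # assumed mapping: 00->A, 01->C, 10->G, 11->T


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four nucleotides, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(strand: str) -> bytes:
    """Invert bytes_to_dna; the strand length must be a multiple of four."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # BASES.index raises ValueError on any character outside ACGT.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def gc_content(strand: str) -> float:
    """Fraction of G and C bases, a commonly used sequence constraint."""
    if not strand:
        return 0.0
    return (strand.count("G") + strand.count("C")) / len(strand)
```

A round trip such as `dna_to_bytes(bytes_to_dna(payload)) == payload` holds for any byte string, which is the sense in which this mapping is lossless in the absence of errors. It stores 2 bits per nucleotide and has no protection against synthesis or sequencing errors, which is the gap that error-correction coding is meant to address.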
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning
Methods: Focus
