Learning Structurally Stabilized Representations for Multi-modal Lossless DNA Storage

Ben Cao; Tiantian He; Xue Li; Bin Wang; Xiaohu Wu; Qiang Zhang; Yew-Soon Ong

arXiv:2408.00779·cs.LG·August 5, 2024

Open Access

TL;DR

The paper introduces RSRL (Reed-Solomon coded single-stranded representation learning), an end-to-end model inspired by error-correcting codes and structural biology that learns durable, dense, and lossless representations for multi-modal DNA storage. The authors report that it outperforms existing methods.

Contribution

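The paper proposes RSRL, a model that learns representations from Reed-Solomon-coded binary data, applies an RS-code-informed mask to focus on correcting burst errors, and uses a biologically stabilized loss to encourage stable single-stranded structures.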

Findings

01 Higher information density achieved
02 Lower error rates in storage
03 Enhanced durability of representations

Abstract

In this paper, we present Reed-Solomon coded single-stranded representation learning (RSRL), a novel end-to-end model for learning representations for multi-modal lossless DNA storage. In contrast to existing learning-based methods, the proposed RSRL is inspired by both error-correction codecs and structural biology. Specifically, RSRL first learns the representations for subsequent storage from binary data transformed by the Reed-Solomon codec. Then, the representations are masked by an RS-code-informed mask to focus on correcting the burst errors that occur in the learning process. Using the decoded, error-corrected representations, a novel biologically stabilized loss is formulated to regularize the data representations to possess stable single-stranded structures. By incorporating these novel strategies, the proposed RSRL can learn highly durable, dense, and lossless…
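To make the first step of the abstract more concrete, the following is a minimal sketch of Reed-Solomon encoding followed by a mapping from bytes to nucleotides. It is not the RSRL model: the learned representation, the RS-code-informed mask, and the biologically stabilized loss are not reproduced here. The field GF(2^8) with primitive polynomial 0x11d, the 2-bits-per-base mapping (00=A, 01=C, 10=G, 11=T), and all function names are assumptions chosen for illustration. The sketch only shows why a symbol-level code suits burst errors: four adjacent damaged bases fall inside a single byte, so they corrupt a single RS symbol.

```python
# Illustrative sketch only; names and parameters are assumptions, not the paper's code.

# Build GF(2^8) exp/log tables using the primitive polynomial 0x11d (assumed).
EXP = [0] * 512
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D
for _i in range(255, 512):
    EXP[_i] = EXP[_i - 255]


def gf_mul(a, b):
    """Multiply two elements of GF(2^8)."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def poly_mul(p, q):
    """Multiply two polynomials over GF(2^8), highest degree first."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] ^= gf_mul(a, b)
    return r


def generator_poly(nsym):
    """Product of (x + alpha^i) for i in 0..nsym-1."""
    g = [1]
    for i in range(nsym):
        g = poly_mul(g, [1, EXP[i]])
    return g


def rs_encode(msg, nsym):
    """Systematic RS encoding: message bytes followed by nsym parity bytes."""
    gen = generator_poly(nsym)
    buf = list(msg) + [0] * nsym
    for i in range(len(msg)):
        coef = buf[i]
        if coef != 0:
            for j in range(1, len(gen)):
                buf[i + j] ^= gf_mul(gen[j], coef)
    return list(msg) + buf[len(msg):]


def syndromes(codeword, nsym):
    """Evaluate the codeword at alpha^0..alpha^(nsym-1); all zero if valid."""
    out = []
    for i in range(nsym):
        y = 0
        for c in codeword:  # Horner evaluation, highest degree first
            y = gf_mul(y, EXP[i]) ^ c
        out.append(y)
    return out


BASES = "ACGT"  # assumed mapping: 00=A, 01=C, 10=G, 11=T


def bytes_to_dna(data):
    """Map each byte to four bases, most significant bit pair first."""
    return "".join(BASES[(b >> s) & 3] for b in data for s in (6, 4, 2, 0))


def dna_to_bytes(seq):
    """Inverse of bytes_to_dna; len(seq) must be a multiple of 4."""
    out = []
    for i in range(0, len(seq), 4):
        b = 0
        for ch in seq[i:i + 4]:
            b = (b << 2) | BASES.index(ch)
        out.append(b)
    return out


if __name__ == "__main__":
    codeword = rs_encode(list(b"DNA"), nsym=4)
    strand = bytes_to_dna(codeword)
    print(strand, any(syndromes(dna_to_bytes(strand), 4)))
```

A full decoder (error location and correction) is omitted to keep the sketch short; the syndrome check only detects that a strand was damaged. With `nsym` parity symbols, a Reed-Solomon code can correct up to `nsym // 2` symbol errors at unknown positions.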

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning

Methods: Focus