# A biologically constrained encoding solution for long-term storage of images onto synthetic DNA

**Authors:** Melpomeni Dimopoulou, Marc Antonini, Pascal Barbry, Raja Appuswamy

arXiv: 1904.03024 · 2019-04-08

## TL;DR

This paper presents a biologically constrained encoding method for storing digital images in synthetic DNA, improving robustness to errors and optimizing compression for long-term data storage.

## Contribution

It introduces a new DNA encoding algorithm that is robust to synthesis and sequencing errors and, through an optimized allocation process, gives control over the compression ratio and thus the length of the synthesized DNA strand.

## Key findings

- Improved coding potential over previous methods
- Enhanced robustness to synthesis and sequencing errors
- Controlled DNA strand length through optimized allocation

## Abstract

Living in the age of the digital media explosion, the amount of data that is being stored increases dramatically. However, even if existing storage systems suggest efficiency in capacity, they lack durability. Hard disks, flash, tape or even optical storage have a limited lifespan in the range of 5 to 20 years. Interestingly, recent studies have proven that it is possible to use synthetic DNA for the storage of digital data, introducing a strong candidate to achieve data longevity. DNA's biological properties allow the storage of a great amount of information in an extraordinarily small volume while also promising efficient storage for centuries or even longer with no loss of information. However, encoding digital data onto DNA is not straightforward, because the decoder has to be robust to sequencing noise. Furthermore, synthesizing DNA is an expensive process, and thus controlling the compression ratio by optimizing the rate-distortion trade-off is an important challenge. This work proposes a coding solution for the storage of digital images onto synthetic DNA. We developed a new encoding algorithm which generates a DNA code robust to biological errors coming from the synthesis and the sequencing processes. Furthermore, thanks to an optimized allocation process, the solution is able to control the compression ratio and thus the length of the synthesized DNA strand. Results show an improvement in terms of coding potential compared to previous state-of-the-art works.
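To make "biologically constrained" concrete, the sketch below checks a candidate strand against two constraints that are commonly cited in the DNA storage literature: a cap on homopolymer run length and a bounded GC fraction. This summary does not state which constraints the paper enforces, so the constraint set, the function names, and the thresholds (`max_run=3`, GC between 40% and 60%) are all illustrative assumptions, not the authors' method.

```python
def max_homopolymer_run(strand: str) -> int:
    """Length of the longest run of identical consecutive nucleotides."""
    longest = 0
    run = 0
    prev = None
    for base in strand:
        run = run + 1 if base == prev else 1
        longest = max(longest, run)
        prev = base
    return longest


def gc_fraction(strand: str) -> float:
    """Fraction of nucleotides that are G or C (0.0 for an empty strand)."""
    if not strand:
        return 0.0
    return sum(base in "GC" for base in strand) / len(strand)


def satisfies_constraints(
    strand: str,
    max_run: int = 3,      # assumed threshold, not taken from the paper
    gc_min: float = 0.4,   # assumed threshold, not taken from the paper
    gc_max: float = 0.6,   # assumed threshold, not taken from the paper
) -> bool:
    """True if the strand meets both illustrative constraints."""
    return (
        max_homopolymer_run(strand) <= max_run
        and gc_min <= gc_fraction(strand) <= gc_max
    )
```

An encoder that only ever emits codewords passing such a check avoids, by construction, the sequence patterns assumed here to be error-prone; the actual code construction is described in the full paper.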

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.03024/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1904.03024/full.md

## References

8 references — full list in the complete paper: https://tomesphere.com/paper/1904.03024/full.md

---
Source: https://tomesphere.com/paper/1904.03024