MQ-Coder inspired arithmetic coder for synthetic DNA data storage (2306.12708v1)
Abstract: In recent years, the ever-growing demand for data storage, particularly for "cold" (i.e., rarely accessed) data, has motivated research into alternative storage systems. Because of their biochemical characteristics, synthetic DNA molecules are now considered serious candidates for this new kind of storage. This paper introduces a novel arithmetic coder for DNA data storage and presents results for a lossy, JPEG 2000-based image compression method adapted for DNA data storage that uses this coder. The DNA coding algorithms presented here are designed to compress images efficiently, encode them into a quaternary code, and finally store them in synthetic DNA molecules. This work also aims to make the compression models better suited to the problems encountered when storing data in DNA, namely that DNA writing, storage, and reading are error-prone processes. The main takeaway of this work is our arithmetic coder and its integration into a high-performing image codec.