Achievable Rates of Nanopore-based DNA Storage
By: Brendon McBain, Emanuele Viterbo
Potential Business Impact:
Stores lots of data in tiny DNA strands.
This paper studies achievable rates of nanopore-based DNA storage when nanopore signals are decoded using a tractable channel model that does not rely on a basecalling algorithm. Specifically, the noisy nanopore channel (NNC) uses the Scrappie pore model to generate average output levels, which undergo i.i.d. geometric sample duplications and are corrupted by i.i.d. Gaussian noise (NNC-Scrappie). Simplified message passing algorithms are derived for efficient soft decoding of nanopore signals using NNC-Scrappie. Previously, evaluation of this channel model was limited by the lack of DNA storage datasets that include nanopore signals. This is solved by deriving an achievable rate based on the dynamic time-warping (DTW) algorithm that can be applied to genomic sequencing datasets, subject to constraints that make the resulting rate applicable to DNA storage. Using a publicly available dataset from Oxford Nanopore Technologies (ONT), it is demonstrated that coding over multiple DNA strands of $100$ bases in length and decoding with the NNC-Scrappie decoder can achieve rates of at least $0.64$ to $1.18$ bits per base, depending on the channel quality of the nanopore chosen in the sequencing device for each channel use, and $0.96$ bits per base on average assuming uniformly chosen nanopores. These rates are pessimistic because they apply only to single reads and do not include calibration of the pore model to specific nanopores.
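As a rough illustration of the channel described above, the following Python sketch duplicates each average output level a geometrically distributed number of times, adds Gaussian noise, and then aligns the resulting samples back to the levels with textbook dynamic time warping. It is a minimal sketch under stated assumptions, not the paper's decoder or rate derivation: the function names `simulate_nnc` and `dtw_cost`, the parameters `p_stop` and `sigma`, and the example level values are hypothetical, and the levels are supplied directly by the caller rather than taken from the Scrappie pore model.

```python
import random


def simulate_nnc(levels, p_stop, sigma, rng):
    """Toy noisy-nanopore-style channel (illustrative assumption).

    Each level is emitted d times, where d is i.i.d. geometric on
    {1, 2, ...} with success probability p_stop, and every emitted
    sample is corrupted by i.i.d. Gaussian noise N(0, sigma^2).
    """
    samples, durations = [], []
    for level in levels:
        d = 1
        # Keep duplicating until a "stop" event with probability p_stop.
        while rng.random() >= p_stop:
            d += 1
        durations.append(d)
        samples.extend(level + rng.gauss(0.0, sigma) for _ in range(d))
    return samples, durations


def dtw_cost(signal, levels):
    """Textbook DTW with squared-error local cost.

    Returns the minimum cumulative cost of a monotone alignment between
    the sample sequence and the level sequence.
    """
    n, m = len(signal), len(levels)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (signal[i - 1] - levels[j - 1]) ** 2
            cost[i][j] = local + min(
                cost[i - 1][j],      # another sample of the same level
                cost[i - 1][j - 1],  # move on to the next level
                cost[i][j - 1],      # one sample shared by two levels
            )
    return cost[n][m]


if __name__ == "__main__":
    example_levels = [0.5, -1.0, 2.0, 0.0]  # made-up values, not Scrappie output
    rng = random.Random(0)
    signal, durations = simulate_nnc(example_levels, p_stop=0.3, sigma=0.2, rng=rng)
    print("durations:", durations)
    print("DTW cost:", dtw_cost(signal, example_levels))
```

With `sigma` set to zero the samples are exact copies of the levels, so the DTW cost is zero; increasing `sigma` or lowering `p_stop` makes the alignment harder, which loosely mirrors how nanopore channel quality affects what a decoder can recover.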
Similar Papers
Block Length Gain for Nanopore Channels
Information Theory
Stores more computer data safely in DNA.
NEURODNAAI: Neural pipeline approaches for the advancing dna-based information storage as a sustainable digital medium using deep learning framework
Emerging Technologies
Stores lots of computer data safely in tiny DNA.
Complex DNA Synthesis Sequences
Information Theory
Stores way more information in tiny DNA bits.