Estimators for Substitution Rates in Genomes from Read Data
By: Shiv Pratap Singh Rathore, Navin Kashyap
We study the problem of estimating the mutation rate between two sequences from noisy sequencing reads. Existing alignment-free methods typically assume direct access to the full sequences. We extend these methods to the sequencing framework, where only noisy reads from the sequences are observed. We use a simple model in which both mutations and sequencing errors are substitutions. We propose multiple estimators, provide theoretical guarantees for one of them, and evaluate the others through simulations.
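To make the substitution-only model concrete, here is a minimal simulation sketch in Python. It is not one of the paper's estimators. It assumes a four-letter alphabet, a substituted base drawn uniformly from the other three, and known read start positions, which is a simplification the alignment-free setting does not grant. All names (`substitute`, `sample_reads`, `estimate_rate`) and parameter values are hypothetical. Under these assumptions, a mutation rate p and a sequencing error rate e give a per-base mismatch probability q = p + e - (4/3)pe between a read and the original sequence, which can be inverted for p when e is known.

```python
import random

BASES = "ACGT"


def substitute(seq, rate, rng):
    """Replace each base, independently with probability `rate`,
    by one of the three other bases chosen uniformly (assumed model)."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def sample_reads(seq, read_len, n_reads, error_rate, rng):
    """Draw reads at uniform start positions and add substitution errors.
    Start positions are returned only to keep this illustration simple."""
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(seq) - read_len + 1)
        fragment = seq[start:start + read_len]
        reads.append((start, substitute(fragment, error_rate, rng)))
    return reads


def estimate_rate(reference, reads, error_rate):
    """Invert q = p + e - (4/3) * p * e for p, where q is the observed
    per-base mismatch fraction between the reads and the reference."""
    mismatches = 0
    total = 0
    for start, read in reads:
        for offset, base in enumerate(read):
            mismatches += base != reference[start + offset]
            total += 1
    q = mismatches / total
    return (q - error_rate) / (1 - 4 * error_rate / 3)


rng = random.Random(0)  # fixed seed so the run is reproducible
original = "".join(rng.choice(BASES) for _ in range(20000))
mutated = substitute(original, 0.05, rng)            # mutation rate p = 0.05
reads = sample_reads(mutated, 100, 2000, 0.01, rng)  # error rate e = 0.01
p_hat = estimate_rate(original, reads, 0.01)
print(f"estimated mutation rate: {p_hat:.4f}")
```

The correction term matters because a sequencing error can land on a mutated position and, with probability 1/3, restore the original base. Ignoring it and using q - e alone would slightly underestimate p.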
Similar Papers
On the Reliability of Information Retrieval From MDS Coded Data in DNA Storage
Information Theory
Stores more information safely in DNA.
Bayesian inference from time series of allele frequency data using exact simulation techniques
Populations and Evolution
Tracks how genes change over time.
Error-Correcting Codes for Labeled DNA Sequences
Information Theory
Fixes mistakes when reading DNA labels.