Estimators for Substitution Rates in Genomes from Read Data
Shiv Pratap Singh Rathore, Navin Kashyap

TL;DR
This paper develops and analyzes estimators for mutation rates between genomes using noisy sequencing reads, extending alignment-free methods to more realistic read-based data with theoretical and simulation validation.
Contribution
It introduces new estimators for substitution rates from read data, providing theoretical guarantees and simulation evaluations in a sequencing context.
Findings
Proposed multiple estimators for mutation rate estimation.
Provided theoretical guarantees for one estimator.
Evaluated the remaining estimators through simulations.
Abstract
We study the problem of estimating the mutation rate between two sequences from noisy sequencing reads. Existing alignment-free methods typically assume direct access to the full sequences. We extend these methods to the sequencing framework, where only noisy reads from the sequences are observed. We use a simple model in which both mutations and sequencing errors are substitutions. We propose multiple estimators, provide theoretical guarantees for one of them, and evaluate the others through simulations.
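The substitution-only model described in the abstract can be sketched as a small simulation. This is an illustrative sketch, not the authors' method: the function names, the uniform choice of replacement base, the uniform sampling of read positions, and all parameter values are assumptions. The k-mer estimator shown is one well-known full-sequence alignment-free baseline (a k-mer survives mutation with probability (1 - p)^k), included only to show what "direct access to the full sequences" looks like; the paper's read-based estimators are not reproduced here.

```python
import random

BASES = "ACGT"

def substitute(seq, rate, rng):
    """Replace each base independently, with probability `rate`,
    by a different base chosen uniformly (assumed substitution model)."""
    out = []
    for b in seq:
        if rng.random() < rate:
            out.append(rng.choice([x for x in BASES if x != b]))
        else:
            out.append(b)
    return "".join(out)

def sample_reads(genome, read_len, n_reads, error_rate, rng):
    """Draw reads at uniform random positions and add substitution errors."""
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(genome) - read_len + 1)
        reads.append(substitute(genome[start:start + read_len], error_rate, rng))
    return reads

def kmers(seq, k):
    """Set of distinct k-mers in `seq`."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def kmer_rate_estimate(s, t, k):
    """Full-sequence baseline: the fraction of k-mers of s found in t
    is roughly (1 - p)^k, so invert it to estimate p."""
    ks = kmers(s, k)
    shared_fraction = len(ks & kmers(t, k)) / len(ks)
    return 1.0 - shared_fraction ** (1.0 / k)

# Hypothetical parameters, chosen only for illustration.
rng = random.Random(0)
s = "".join(rng.choice(BASES) for _ in range(20000))
t = substitute(s, 0.05, rng)                      # mutated genome
reads_t = sample_reads(t, 100, 500, 0.01, rng)    # noisy reads from t
print("full-sequence estimate:", kmer_rate_estimate(s, t, 15))
```

With only `reads_t` available instead of `t`, sequencing errors also destroy k-mers, so applying the baseline directly to reads would conflate the mutation rate with the error rate; this is the gap the read-based setting has to address.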
Taxonomy
Topics: Genomics and Phylogenetic Studies · Cancer Genomics and Diagnostics · RNA and protein synthesis mechanisms
