Bayesian genome assembly and assessment by Markov Chain Monte Carlo sampling
Mark Howison, Felipe Zapata, Erika J. Edwards, Casey W. Dunn

TL;DR
This paper introduces a Bayesian genome assembly method that uses Markov Chain Monte Carlo sampling to generate a posterior probability distribution over possible assemblies, so that assembly uncertainty can be assessed explicitly rather than hidden behind a single sequence.
Contribution
It presents a Bayesian approach to genome assembly based on MCMC sampling, implemented in a prototype assembler, that replaces the usual point estimate with a set of assembly hypotheses weighted by posterior probability.
Findings
Provides posterior probability distributions for assemblies
Demonstrates application on bacteriophage PhiX174
Enables explicit assessment of assembly uncertainty
Abstract
Most genome assemblers construct point estimates, choosing a genome sequence from among many alternative hypotheses that are supported by the data. We present a Markov Chain Monte Carlo approach to sequence assembly that instead generates distributions of assembly hypotheses with posterior probabilities, providing an explicit statistical framework for evaluating alternative hypotheses and assessing assembly uncertainty. We implement this approach in a prototype assembler and illustrate its application to the bacteriophage PhiX174.
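To make the idea of sampling assembly hypotheses concrete, here is a minimal sketch of a Metropolis sampler over candidate sequences. It is not the authors' prototype assembler. The read error model, the flat prior, the fixed genome length, the single-base proposal, and every name in it (`read_prob`, `log_likelihood`, `sample_assemblies`, `error_rate`) are assumptions made for illustration only.

```python
import math
import random
from collections import Counter

ALPHABET = "ACGT"

def read_prob(read, genome, error_rate):
    """P(read | genome) under a toy model: the read starts at a uniformly
    random offset, and each base is copied correctly with probability
    1 - error_rate, otherwise replaced by one of the 3 other bases."""
    n_pos = len(genome) - len(read) + 1
    total = 0.0
    for start in range(n_pos):
        p = 1.0
        for a, b in zip(read, genome[start:]):
            p *= (1 - error_rate) if a == b else error_rate / 3
        total += p
    return total / n_pos

def log_likelihood(genome, reads, error_rate=0.05):
    """Log-probability of all reads, treated as independent draws."""
    return sum(math.log(read_prob(r, genome, error_rate)) for r in reads)

def sample_assemblies(reads, start, steps=20000, seed=0):
    """Metropolis sampling over fixed-length sequences with a flat prior.

    Returns a dict mapping each visited sequence to the fraction of steps
    spent there, which estimates its posterior probability. No burn-in is
    discarded in this sketch."""
    rng = random.Random(seed)
    current = start
    current_ll = log_likelihood(current, reads)
    visits = Counter()
    for _ in range(steps):
        # Symmetric proposal: overwrite one random position with a random base.
        pos = rng.randrange(len(current))
        proposal = current[:pos] + rng.choice(ALPHABET) + current[pos + 1:]
        proposal_ll = log_likelihood(proposal, reads)
        # Accept with probability min(1, likelihood ratio).
        if rng.random() < math.exp(min(0.0, proposal_ll - current_ll)):
            current, current_ll = proposal, proposal_ll
        visits[current] += 1
    return {genome: n / steps for genome, n in visits.items()}

if __name__ == "__main__":
    # Hypothetical data: error-free 4-base reads tiling an 8-base sequence,
    # and a draft assembly that contains one wrong base.
    reads = ["ACGT", "CGTT", "GTTG", "TTGC", "TGCA"]
    posterior = sample_assemblies(reads, start="ACGATGCA")
    for genome, p in sorted(posterior.items(), key=lambda kv: -kv[1])[:5]:
        print(genome, round(p, 3))
```

The output is a ranked list of hypotheses with estimated posterior probabilities instead of a single sequence. In this toy setting, positions covered by only one read (the two ends) should appear in lower-probability alternatives more often than well-covered interior positions, which is the kind of uncertainty a point estimate cannot show.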
