Importance Sampling Approximation of Sequence Evolution Models with Site-Dependence
Joseph Mathews, Scott C. Schmidler

TL;DR
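This paper introduces an importance sampling algorithm for approximating the marginal sequence likelihood under sequence evolution models with site-dependent rates, proves matching-order upper and lower bounds on its finite-sample error, and applies the analysis to a dependent-site model from phylogenetics.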
Contribution
The paper presents a novel randomized importance sampling method for sequence evolution models with site dependence, with proven error bounds and practical complexity analysis.
Findings
In practical regimes, sampler complexity is governed by the number of observed mutations rather than growing exponentially with sequence length.
Algorithm is practical for many real-world phylogenetic problems.
Provides problem-specific bounds for a known phylogenetics model.
Abstract
We consider models for molecular sequence evolution in which the transition rates at each site depend on the local sequence context, giving rise to a time-inhomogeneous Markov process in which sites evolve under a complex dependency structure. We introduce a randomized approximation algorithm for the marginal sequence likelihood under these models using importance sampling, and provide matching order upper and lower bounds on the finite sample approximation error. Given two sequences of length n with r observed mutations, we show that for practical regimes of r/n, the complexity of the importance sampler does not grow exponentially in n, but rather in r, making the algorithm practical for many applied problems. We demonstrate the use of our techniques to obtain problem-specific complexity bounds for a well-known dependent-site model from the phylogenetics literature.
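To illustrate the general idea of estimating a marginal likelihood by importance sampling, here is a minimal sketch on a toy problem. It is not the paper's algorithm or proposal distribution. The assumptions are ours: a single two-state site evolving independently, a Poisson number of substitution events with mean `lam`, each event flipping the state, and a geometric proposal over odd event counts. All function names are hypothetical. The quantity estimated is the probability that the two endpoint states differ, which has a closed form and so allows the estimate to be checked.

```python
import math
import random


def poisson_pmf(k, lam):
    """Poisson probability mass at k, computed in log space for stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def sample_geometric(p, rng):
    """Number of failures before the first success; support {0, 1, 2, ...}."""
    j = 0
    while rng.random() >= p:
        j += 1
    return j


def is_estimate(lam, n_samples, p=0.5, seed=0):
    """Importance sampling estimate of P(endpoints differ) for the toy model.

    Endpoints differ exactly when the hidden event count k is odd, so the
    proposal (an assumption of this sketch) only draws odd k = 2j + 1 with
    j geometric. Each draw is weighted by target mass over proposal mass.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        j = sample_geometric(p, rng)
        k = 2 * j + 1
        q = p * (1.0 - p) ** j          # proposal probability of this k
        total += poisson_pmf(k, lam) / q  # importance weight
    return total / n_samples


def exact(lam):
    """Closed form: P(Poisson(lam) is odd) = (1 - exp(-2 * lam)) / 2."""
    return 0.5 * (1.0 - math.exp(-2.0 * lam))


if __name__ == "__main__":
    lam = 0.3
    print("importance sampling:", is_estimate(lam, 20000))
    print("exact:", exact(lam))
```

Because the proposal places mass only on event histories consistent with the observed endpoints, no samples are wasted on histories with zero likelihood. In the site-dependent setting of the paper, no closed form like `exact` is available, which is why finite-sample error bounds for the sampler matter.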
