TL;DR
This paper analyzes the asymptotic behavior of cost functionals in the coalescent model with recurrent mutation, providing new insights into importance sampling algorithms and their performance in large samples.
Contribution
It develops a large-sample theoretical framework for coalescent importance sampling, showing how these algorithms differ from standard sequential importance samplers and informing how they should be tuned computationally.
Findings
Cost functionals converge to tractable processes as sample size grows.
Coalescent importance sampling algorithms behave markedly differently from standard sequential importance samplers, with or without resampling.
Resampling can be detrimental to algorithm performance in certain settings.
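For readers unfamiliar with the terminology, the following is a minimal sketch of the standard importance sampling ingredients that the findings refer to: normalized weights, the effective sample size (ESS), and multinomial resampling. It is a generic illustration, not the paper's algorithm. The log-weights are synthetic Gaussian draws chosen purely for demonstration, and all function names are hypothetical.

```python
import math
import random


def normalize(log_weights):
    """Turn log-weights into normalized weights, subtracting the max for stability."""
    m = max(log_weights)
    unnorm = [math.exp(lw - m) for lw in log_weights]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2) for normalized weights w_i."""
    return 1.0 / sum(w * w for w in weights)


def multinomial_resample(weights, rng):
    """Draw len(weights) particle indices with probability proportional to weight."""
    n = len(weights)
    return rng.choices(range(n), weights=weights, k=n)


# Synthetic log-weights (an assumption for illustration, not coalescent weights).
rng = random.Random(0)
log_w = [rng.gauss(0.0, 2.0) for _ in range(1000)]
w = normalize(log_w)
ess = effective_sample_size(w)
idx = multinomial_resample(w, rng)

print(f"ESS before resampling: {ess:.1f} of {len(w)}")
print(f"distinct particles kept after resampling: {len(set(idx))}")
```

In a standard sequential sampler, resampling is typically triggered when the ESS drops below a threshold. Resampling discards low-weight particles and duplicates high-weight ones, which is one intuition for why it is not always beneficial.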
Abstract
The coalescent is a foundational model of latent genealogical trees under neutral evolution, but suffers from intractable sampling probabilities. Methods for approximating these sampling probabilities either introduce bias or fail to scale to large sample sizes. We show that a class of cost functionals of the coalescent with recurrent mutation and a finite number of alleles converge to tractable processes in the infinite-sample limit. A particular choice of costs yields insight about importance sampling methods, which are a classical tool for coalescent sampling probability approximation. These insights reveal that the behaviour of coalescent importance sampling algorithms differs markedly from standard sequential importance samplers, with or without resampling. We conduct a simulation study to verify that our asymptotics are accurate for algorithms with finite (and moderate) sample…
