The site frequency spectrum for general coalescents
Jeffrey P. Spence, John A. Kamm, Yun S. Song

TL;DR
This paper introduces an efficient method for computing the expected site frequency spectrum (SFS) under general coalescent models, including time-inhomogeneous ones, for use in population-genetic inference.
Contribution
It derives a new formula and algorithm for the expected SFS under general $\Lambda$- and $\Xi$-coalescents, including time-inhomogeneous models, with significant computational improvements.
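For orientation, the standard textbook definition of a $\Lambda$-coalescent (general background, not the paper's new formula) can be written as follows. When $b$ lineages are present, each particular group of $k$ lineages merges at rate

$$\lambda_{b,k} = \int_0^1 x^{k-2}(1-x)^{b-k}\,\Lambda(dx), \qquad 2 \le k \le b,$$

where $\Lambda$ is a finite measure on $[0,1]$. Taking $\Lambda = \delta_0$ recovers Kingman's coalescent, in which only pairwise mergers occur ($\lambda_{b,2} = 1$ and $\lambda_{b,k} = 0$ for $k > 2$). $\Xi$-coalescents generalize this further by allowing several such mergers at the same instant.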
Findings
Algorithm runs in $O(n^2)$ time for time-homogeneous models
Algorithm extends to time-inhomogeneous models with $O(n^3)$ runtime
Theoretical results on identifiability of coalescent measures and functions
Abstract
General genealogical processes such as $\Lambda$- and $\Xi$-coalescents, which respectively model multiple and simultaneous mergers, have important applications in studying marine species, strong positive selection, recurrent selective sweeps, strong bottlenecks, large sample sizes, and so on. Recently, there has been significant progress in developing useful inference tools for such general models. In particular, inference methods based on the site frequency spectrum (SFS) have received noticeable attention. Here, we derive a new formula for the expected SFS for general $\Lambda$- and $\Xi$-coalescents, which leads to an efficient algorithm. For time-homogeneous coalescents, the runtime of our algorithm for computing the expected SFS is $O(n^2)$, where $n$ is the sample size. This is a factor of $n^2$ faster than the state-of-the-art method. Furthermore, in contrast to existing…
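To make the central object concrete, here is a minimal Python sketch of what the SFS is. It is not the paper's algorithm. The function names `observed_sfs` and `kingman_expected_sfs`, the toy haplotype matrix, and the value of `theta` are all illustrative assumptions. The first function tabulates the observed SFS from a 0/1 haplotype matrix; the second gives the classical expected SFS $\theta/i$ for the constant-size Kingman coalescent, which is the special case with pairwise mergers only and serves as a baseline that general $\Lambda$- and $\Xi$-coalescents depart from.

```python
from collections import Counter
from typing import List, Sequence


def observed_sfs(haplotypes: Sequence[Sequence[int]]) -> List[int]:
    """Tabulate the unfolded SFS of a 0/1 haplotype matrix.

    Rows are sampled sequences, columns are sites, and 1 marks the
    derived allele. Entry i-1 of the result counts the sites at which
    exactly i of the n samples carry the derived allele, i = 1..n-1.
    """
    n = len(haplotypes)
    num_sites = len(haplotypes[0]) if n else 0
    # Number of derived alleles at each site, tallied across sites.
    derived_counts = Counter(
        sum(row[site] for row in haplotypes) for site in range(num_sites)
    )
    # Sites with 0 or n derived copies are not segregating and are skipped.
    return [derived_counts.get(i, 0) for i in range(1, n)]


def kingman_expected_sfs(n: int, theta: float) -> List[float]:
    """Classical expected SFS theta / i, i = 1..n-1 (Kingman, constant size).

    theta is the population-scaled mutation rate; its value is an input
    chosen by the user, not something taken from the paper.
    """
    return [theta / i for i in range(1, n)]


# Hypothetical toy data: 4 samples, 5 sites.
sample = [
    [1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
print(observed_sfs(sample))          # three singletons, one site with 3 copies
print(kingman_expected_sfs(4, 2.0))  # baseline expectation under Kingman
```

SFS-based inference compares a model's expected SFS against an observed vector like the one tabulated here, which is why the cost of computing that expectation matters.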
