An Improved Bipartition Cover Bound for the Multispecies Coalescent Model
Zachary McNulty

TL;DR
This paper derives improved, topology-free upper bounds on the number of loci needed to obtain a bipartition cover from gene trees under the multispecies coalescent model, sharpening the existing bounds of Uricchio et al. (2016).
Contribution
It provides sharper bounds on the number of loci required for a bipartition cover. The bounds stay below biologically realistic numbers of loci over a broader range of parameter settings, which makes them more useful for empirical datasets.
Findings
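The bounds remain below biologically realistic numbers of loci across a substantially broader range of parameter settings.
The analysis develops new asymptotics for the bounds and for absorption times under Kingman's coalescent in the short-branch regime.
Simulation comparisons show that the new bounds improve on those of previous work.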
Abstract
Bipartition cover probabilities quantify whether a collection of gene trees contains every bipartition of the underlying species tree, a condition that underlies finite-sample guarantees for summary methods such as ASTRAL. We study this problem under the multispecies coalescent (MSC) model and derive topology-free upper bounds on the number of loci required to obtain a bipartition cover with prescribed confidence, improving upon the existing bounds of Uricchio et al. (2016). Practically, our bounds remain below biologically realistic numbers of loci across a substantially broader range of parameter settings, expanding their usefulness for empirical datasets. Theoretically, our analysis sharpens our understanding of coalescence under the MSC model and develops new asymptotics for these bounds and absorption times under Kingman's coalescent in the natural short branch regime. We further…
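To make the phrase "number of loci required to obtain a bipartition cover with prescribed confidence" concrete, here is a minimal sketch of a generic union-bound calculation. It is not the paper's bound. It assumes a hypothetical quantity `p_min`, a lower bound on the probability that any fixed species-tree bipartition appears in a single gene tree, and it assumes gene trees are independent across loci. The function name `loci_for_cover` and all parameters are illustrative only.

```python
import math


def loci_for_cover(n_taxa: int, p_min: float, eps: float) -> int:
    """Smallest m with (n_taxa - 3) * (1 - p_min)**m <= eps.

    Assumptions (not taken from the paper):
      * p_min is a hypothetical lower bound on the chance that one gene
        tree displays a given species-tree bipartition;
      * gene trees at different loci are independent.
    """
    if n_taxa < 4:
        raise ValueError("need at least 4 taxa for a nontrivial bipartition")
    if not (0.0 < p_min < 1.0) or not (0.0 < eps < 1.0):
        raise ValueError("p_min and eps must lie strictly between 0 and 1")

    # An unrooted binary tree on n leaves has n - 3 internal edges,
    # hence n - 3 nontrivial bipartitions to cover.
    k = n_taxa - 3

    # A fixed bipartition is missing from all m loci with probability at
    # most (1 - p_min)**m; the union bound over k bipartitions gives
    # k * (1 - p_min)**m, which we require to be at most eps.
    return math.ceil(math.log(k / eps) / -math.log1p(-p_min))


if __name__ == "__main__":
    # Illustrative inputs only: 10 taxa, p_min = 0.5, 95% confidence.
    print(loci_for_cover(10, 0.5, 0.05))
```

The result grows only logarithmically in the number of taxa and in 1/eps, but like 1/p_min when `p_min` is small, which is why the short-branch regime (where a bipartition is unlikely to appear in any one gene tree) is the natural place to study asymptotics.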
