On estimating the effective sample size of phylogenetic trees in an autocorrelated chain
Jonathan Klawitter, Lars Berling, Jordan Douglas, Dong Xie, Alexei J. Drummond

TL;DR
This paper compares existing and novel methods for estimating the effective sample size of phylogenetic trees in MCMC, highlights the challenges posed by high dimensionality and autocorrelation, and provides guidance for reliable estimation.
Contribution
It introduces new tree ESS estimators based on tractable distributions and clade frequency differences, and evaluates their accuracy and computational efficiency.
Findings
CCD-based estimators perform well and show lower variance.
The probabilistic estimator is computationally expensive for long chains.
Multimodality affects the accuracy of ESS estimates.
Abstract
Estimating the effective sample size (ESS) is fundamental in Bayesian phylogenetic inference to properly account for autocorrelation in MCMC samples. While methods for continuous parameters are well established, the discrete and high-dimensional nature of treespace poses substantial challenges. Here, we compare existing tree ESS estimators with novel approaches that leverage tractable tree distributions, specifically Conditional Clade Distributions (CCDs), as well as a new probabilistic estimator based on clade frequency differences between independent chains. Using simulated chains with known ESS bounds, we assess estimator accuracy and evaluate their stability and robustness on simulated and real datasets. We further examine how multimodality in posterior distributions and poor mixing can substantially affect ESS estimates, highlighting the need for careful interpretation. Our…
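For context on the "well established" continuous case mentioned above, here is a minimal sketch of the classical autocorrelation-based ESS for a scalar trace, ESS = N / (1 + 2 Σ ρ_k). It is not one of the paper's tree ESS estimators; the function names (`autocorr`, `ess`, `ar1_chain`) and the rule of truncating the sum at the first non-positive autocorrelation are assumptions chosen for illustration.

```python
import random


def autocorr(x, lag):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    if var == 0:
        return 0.0  # constant trace: autocorrelation is undefined, treat as 0
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var


def ess(x, max_lag=None):
    """Estimate ESS as N / (1 + 2 * sum of leading positive autocorrelations)."""
    n = len(x)
    if max_lag is None:
        max_lag = n // 2
    tau = 1.0  # integrated autocorrelation time
    for lag in range(1, max_lag + 1):
        rho = autocorr(x, lag)
        if rho <= 0:  # assumed truncation rule: stop at first non-positive lag
            break
        tau += 2 * rho
    return n / tau


def ar1_chain(n, phi, seed=0):
    """Toy autocorrelated chain: x[t] = phi * x[t-1] + standard normal noise."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0, 1))
    return x


if __name__ == "__main__":
    independent = ar1_chain(2000, 0.0, seed=1)
    correlated = ar1_chain(2000, 0.9, seed=1)
    print("ESS, phi=0.0:", round(ess(independent)))
    print("ESS, phi=0.9:", round(ess(correlated)))
```

The strongly autocorrelated chain yields a much smaller ESS than the nearly independent one, even though both contain 2000 samples. Trees have no natural mean or variance, so this recipe does not carry over directly, which is the difficulty the abstract describes.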
Taxonomy
Topics: Evolution and Paleontology Studies · Genomics and Phylogenetic Studies · Genetic diversity and population structure
