Weighted Statistical Binning: enabling statistically consistent genome-scale phylogenetic analyses
Md. Shamsuzzoha Bayzid, Siavash Mirarab, Bastien Boussau, Tandy Warnow

TL;DR
This paper introduces weighted statistical binning, a variant of statistical binning that is statistically consistent under the multi-species coalescent model while retaining the accuracy of unweighted binning in genome-scale species tree estimation.
Contribution
It proposes a weighted approach to statistical binning that ensures statistical consistency while maintaining empirical accuracy in species tree estimation.
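To make the idea of weighting concrete, here is a minimal sketch. It assumes that weighting means each bin's supergene tree is passed to the summary method once per gene in that bin; this summary does not spell out that detail. The function name `weighted_supergene_trees`, the bin labels, and the Newick strings are hypothetical and chosen for illustration only.

```python
from collections import Counter


def weighted_supergene_trees(bins, supergene_trees):
    """Repeat each bin's supergene tree once per gene in the bin.

    bins: dict mapping a bin id to the list of gene names in that bin.
    supergene_trees: dict mapping a bin id to its supergene tree (Newick string).
    Returns a list of trees that could be handed to a summary method.
    Assumption: weight = bin size (hypothetical reading of "weighted").
    """
    weighted = []
    for bin_id, genes in bins.items():
        # A bin with k genes contributes k copies of its supergene tree.
        weighted.extend([supergene_trees[bin_id]] * len(genes))
    return weighted


# Illustrative input: two bins of unequal size (made-up data).
example_bins = {"bin1": ["g1", "g2", "g3"], "bin2": ["g4"]}
example_trees = {"bin1": "((A,B),C);", "bin2": "((A,C),B);"}

example_weighted = weighted_supergene_trees(example_bins, example_trees)
example_counts = Counter(example_weighted)
```

In this sketch the larger bin contributes three copies of its tree and the singleton bin contributes one, so the input to the summary method keeps one tree per original gene.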
Findings
Weighted statistical binning is statistically consistent.
The method improves accuracy in genome-scale phylogenetic analyses.
It maintains empirical performance comparable to unweighted binning.
Abstract
Because biological processes can make different loci have different evolutionary histories, species tree estimation requires multiple loci from across the genome. While many processes can result in discord between gene trees and species trees, incomplete lineage sorting (ILS), modeled by the multi-species coalescent, is considered to be a dominant cause for gene tree heterogeneity. Coalescent-based methods have been developed to estimate species trees, many of which operate by combining estimated gene trees, and so are called summary methods. Because summary methods are generally fast, they have become very popular techniques for estimating species trees from multiple loci. However, recent studies have established that summary methods can have reduced accuracy in the presence of gene tree estimation error, and also that many biological datasets have substantial gene tree estimation…
