On the number of ranked species trees producing anomalous ranked gene trees
Filippo Disanto, Noah A. Rosenberg

TL;DR
This paper investigates how often ranked species trees give rise to anomalous ranked gene trees (ARGTs), that is, ranked gene trees that are more probable than the ranked gene tree matching the ranked species tree. It shows that as the number of species grows, the fraction of ranked labeled species trees that produce ARGTs approaches 1.
Contribution
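The paper provides exact enumerations and asymptotic estimates for two classes of ranked labeled species trees, defined by the presence or avoidance of subtree patterns associated with ARGT production, and demonstrates that ARGT-producing trees dominate as the species count increases.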
Findings
Fraction of ARGT-producing trees approaches 1 with more species
Exact enumeration and asymptotic estimates for tree classes
Extends earlier existence results to a probabilistic claim about the frequency of ARGTs
Abstract
Analysis of probability distributions conditional on species trees has demonstrated the existence of anomalous ranked gene trees (ARGTs), ranked gene trees that are more probable than the ranked gene tree that accords with the ranked species tree. Here, to improve the characterization of ARGTs, we study enumerative and probabilistic properties of two classes of ranked labeled species trees, focusing on the presence or avoidance of certain subtree patterns associated with the production of ARGTs. We provide exact enumerations and asymptotic estimates for cardinalities of these sets of trees, showing that as the number of species increases without bound, the fraction of all ranked labeled species trees that are ARGT-producing approaches 1. This result extends beyond earlier existence results to provide a probabilistic claim about the frequency of ARGTs.
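As a small illustration of the denominator in the fraction discussed above, the following sketch counts all ranked labeled binary trees on n species. It does not reproduce the paper's enumerations of the two tree classes; it relies only on the standard fact that a ranked labeled tree can be built backward in time by merging one pair of lineages at each step, which gives a product of binomial coefficients equal to n!(n-1)!/2^(n-1). The function names are hypothetical and chosen for this example.

```python
from math import comb, factorial


def count_ranked_labeled_trees(n: int) -> int:
    """Count ranked labeled binary trees on n leaves.

    Going backward in time, k lineages merge by choosing one of
    C(k, 2) pairs, for k = n, n-1, ..., 2. The order of the merges
    is the ranking, so the count is the product of these choices.
    """
    if n < 2:
        raise ValueError("need at least two species")
    total = 1
    for k in range(2, n + 1):
        total *= comb(k, 2)
    return total


def count_ranked_labeled_trees_closed_form(n: int) -> int:
    """Equivalent closed form: n! * (n-1)! / 2^(n-1)."""
    if n < 2:
        raise ValueError("need at least two species")
    return factorial(n) * factorial(n - 1) // 2 ** (n - 1)


if __name__ == "__main__":
    for n in range(2, 9):
        print(n, count_ranked_labeled_trees(n))
```

The count grows very quickly with n, which is one reason asymptotic estimates, rather than exhaustive listing, are needed to describe the fraction of trees that are ARGT-producing for large n.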
Taxonomy
Topics: Stochastic processes and statistical mechanics · Bioinformatics and Genomic Networks · Gene expression and cancer classification
