Note on expected internode distances for gene trees in species trees
Martin Kreidl

TL;DR
This paper provides a rigorous proof that the expected internode distances derived from gene trees under the multispecies coalescent model form a tree-like distance matrix with the same topology as the true species tree, supporting their use in species tree estimation.
Contribution
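It offers a formal proof that the expected internode distances on gene trees under the multispecies coalescent model form an additive (tree-like) distance matrix whose underlying tree has the topology of the species tree, which justifies using observed average internode distances as input for species tree estimation.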
Findings
Expected internode distances form a tree-like matrix
The topology of the distance matrix matches the true species tree
Supports using neighbor joining on average internode distances for statistically consistent species tree estimation
Abstract
In a recent paper, 'Estimating Species Trees from Unrooted Gene Trees', Liu and Yu observe that the distance matrix on the underlying taxon set, built from expected internode distances on gene trees under the multispecies coalescent, is tree-like, and that the underlying additive tree has the same topology as the true species tree. They therefore suggest using the (observed) average internode distances on gene trees as input to the neighbor joining algorithm in order to estimate the underlying species tree in a statistically consistent way. In this note we give a rigorous proof of this observation.
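The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the note's proof or Liu and Yu's implementation. It assumes that the internode distance between two taxa is the number of internal nodes on the path joining them in an unrooted gene tree (edge count minus one), and it tests tree-likeness with the standard four-point condition. The helper `quartet`, the adjacency-list tree encoding, and the toy sample of four gene trees are all hypothetical choices made for this example.

```python
from collections import deque
from itertools import combinations


def quartet(a, b, c, d):
    """Hypothetical helper: unrooted four-taxon tree with split ab|cd."""
    return {a: ["u"], b: ["u"], c: ["v"], d: ["v"],
            "u": [a, b, "v"], "v": [c, d, "u"]}


def internode_distances(adj):
    """Pairwise internode distances between the leaves of one unrooted tree.

    adj maps each node to the list of its neighbours; leaves have degree 1.
    Assumption: internode distance = internal nodes on the path = edges - 1.
    """
    leaves = sorted(n for n, nbrs in adj.items() if len(nbrs) == 1)
    dist = {}
    for src in leaves:
        # Breadth-first search yields the edge count from src to every node.
        depth = {src: 0}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in depth:
                    depth[nbr] = depth[node] + 1
                    queue.append(nbr)
        for dst in leaves:
            if src < dst:
                dist[(src, dst)] = depth[dst] - 1
    return dist


def average_internode_distances(trees):
    """Average the internode distance of each taxon pair over all gene trees."""
    total = {}
    for adj in trees:
        for pair, d in internode_distances(adj).items():
            total[pair] = total.get(pair, 0) + d
    return {pair: s / len(trees) for pair, s in total.items()}


def satisfies_four_point(dist, taxa, tol=1e-9):
    """Four-point condition: for every quartet the two largest of the three
    pairwise sums are equal (up to tol)."""
    def d(a, b):
        return dist[(a, b)] if a < b else dist[(b, a)]

    for i, j, k, l in combinations(taxa, 4):
        sums = sorted([d(i, j) + d(k, l), d(i, k) + d(j, l), d(i, l) + d(j, k)])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


# Toy sample: two gene trees AB|CD and one each of the two other topologies.
trees = [quartet("A", "B", "C", "D")] * 2 + [
    quartet("A", "C", "B", "D"),
    quartet("A", "D", "B", "C"),
]
avg = average_internode_distances(trees)
tree_like = satisfies_four_point(avg, ["A", "B", "C", "D"])
```

In this toy sample the two minority topologies occur equally often, so the averaged matrix satisfies the four-point condition, and the smallest pairwise sum, d(A,B) + d(C,D), points to the split AB|CD. With unequal counts from a finite sample the condition generally holds only approximately, which is why the statement proved in the note concerns expected distances, while the observed averages serve as input to neighbor joining.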
Taxonomy
Topics: Bioinformatics and Genomic Networks · Gene expression and cancer classification · Gene Regulatory Network Analysis
