Statistical Phylogenetic Tree Analysis Using Differences of Means
Elissaveta Arnaoudova, David Haws, Peter Huggins, Jerzy W. Jaromczyk, Neil Moore, Chris Schardl, Ruriko Yoshida

TL;DR
This paper introduces a statistical method for comparing phylogenetic tree distributions to detect significant incongruences, aiding in understanding genome evolution events like horizontal gene transfer.
Contribution
It presents a novel statistical approach using difference of means and kernel embeddings to compare tree distributions, with applications to real biological data and a supporting toolkit.
Findings
Successfully distinguishes gene tree sets under different models
Detects unusual genome evolution events in real datasets
Provides a computational toolkit for phylogenetic analysis
Abstract
We propose a statistical method to test whether two phylogenetic trees with given alignments are significantly incongruent. Our method compares the two distributions of phylogenetic trees given by the input alignments, instead of comparing point estimates of trees. This statistical approach can be applied to gene tree analysis, for example to detect unusual events in genome evolution such as horizontal gene transfer and reshuffling. Our method uses the difference of means to compare two distributions of trees, after embedding the trees in a vector space. Bootstrapping alignment columns can then be applied to obtain p-values. To compute distances between means, we employ a "kernel trick" that speeds up distance calculations when trees are embedded in a high-dimensional feature space, e.g. a splits or quartets feature space. In this pilot study, first we test our statistical method's ability to…
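The kernel trick mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' toolkit: the tree representation (each tree as a set of splits, each split stored as the taxa on one canonical side), the function names, and the toy samples are all assumptions made for illustration. If each tree is embedded as a 0/1 indicator vector over splits, the dot product of two trees is the number of splits they share, so the squared distance between the two sample means can be computed from pairwise kernel values alone, without building the high-dimensional vectors.

```python
from itertools import product


def tree(*sides):
    """Hypothetical helper: a tree as a frozenset of splits.

    Each split is given by the taxa on one (canonical) side, e.g. "DE".
    """
    return frozenset(frozenset(side) for side in sides)


def split_kernel(t1, t2):
    """Dot product of split-indicator vectors = number of shared splits."""
    return len(t1 & t2)


def mean_kernel(xs, ys, kernel):
    """Average kernel value over all pairs (x, y) with x in xs, y in ys."""
    return sum(kernel(x, y) for x, y in product(xs, ys)) / (len(xs) * len(ys))


def squared_mean_distance(sample_a, sample_b, kernel=split_kernel):
    """||mean(A) - mean(B)||^2 computed from kernel values only."""
    return (mean_kernel(sample_a, sample_a, kernel)
            + mean_kernel(sample_b, sample_b, kernel)
            - 2 * mean_kernel(sample_a, sample_b, kernel))


def explicit_squared_mean_distance(sample_a, sample_b):
    """Same quantity via explicit embedding, for checking the kernel version."""
    all_splits = set().union(*sample_a, *sample_b)

    def mean_vector(sample):
        # Fraction of trees in the sample that contain each split.
        return [sum(s in t for t in sample) / len(sample) for s in all_splits]

    mean_a, mean_b = mean_vector(sample_a), mean_vector(sample_b)
    return sum((a - b) ** 2 for a, b in zip(mean_a, mean_b))


# Toy five-taxon trees (A..E), two nontrivial splits each; made-up data.
t1 = tree("CDE", "DE")   # AB|CDE and ABC|DE
t2 = tree("CDE", "CD")   # AB|CDE and ABE|CD
t3 = tree("BDE", "DE")   # AC|BDE and ABC|DE

sample_a = [t1, t1, t2]  # stands in for trees from one alignment
sample_b = [t3, t3, t1]  # stands in for trees from the other alignment

kernel_dist = squared_mean_distance(sample_a, sample_b)
explicit_dist = explicit_squared_mean_distance(sample_a, sample_b)
print(round(kernel_dist, 4), round(explicit_dist, 4))
```

Both routes give the same value on this toy input (10/9). In a full test, this distance would be recomputed on bootstrap replicates of the alignment columns to obtain a p-value; that step needs tree inference and is outside this sketch.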
Taxonomy
Topics: Genetic diversity and population structure · Genomics and Phylogenetic Studies · Plant Taxonomy and Phylogenetics
