Principal component analysis and the locus of the Fréchet mean in the space of phylogenetic trees
Tom M. W. Nye, Xiaoxian Tang, Grady Weyenberg, Ruriko Yoshida

TL;DR
This paper extends principal component analysis to the non-Euclidean space of phylogenetic trees by introducing a geometric object based on the Fréchet mean, enabling better visualization and analysis of complex biological data.
Contribution
It proposes a novel geometric approach for higher-order principal components in tree-space using the locus of the weighted Fréchet mean, with an efficient projection algorithm.
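To make "locus of the weighted Fréchet mean" concrete, here is a minimal sketch in Python. It is not the paper's algorithm. As a simplifying assumption it uses ordinary Euclidean distance in place of the geodesic distance on tree-space, so the weighted Fréchet mean reduces to a weighted average and the locus of three vertices is the flat triangle they span. In tree-space the mean generally has to be found by iterative minimisation instead. All function names are hypothetical and chosen for illustration only.

```python
from itertools import product


def frechet_function(x, points, weights):
    """Weighted sum of squared Euclidean distances from x to each point."""
    return sum(
        w * sum((xi - pi) ** 2 for xi, pi in zip(x, p))
        for w, p in zip(weights, points)
    )


def weighted_frechet_mean(points, weights):
    """Minimiser of the Frechet function (Euclidean case only: the weighted average)."""
    total = sum(weights)
    dim = len(points[0])
    return tuple(
        sum(w * p[d] for w, p in zip(weights, points)) / total
        for d in range(dim)
    )


def simplex_grid(k, steps):
    """Yield weight vectors of length k whose entries are multiples of 1/steps and sum to 1."""
    for combo in product(range(steps + 1), repeat=k):
        if sum(combo) == steps:
            yield tuple(c / steps for c in combo)


def locus(points, steps=4):
    """Sample the locus: one weighted Frechet mean per weight vector on the grid."""
    return [
        weighted_frechet_mean(points, w)
        for w in simplex_grid(len(points), steps)
    ]


# Three vertices stand in for three trees (assumption: points in the plane).
vertices = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
surface = locus(vertices, steps=4)
```

Varying the weights over the simplex sweeps out the candidate "principal surface"; with k + 1 vertices the same construction gives a k-dimensional analogue. Swapping in a tree-space distance and an iterative minimiser would be the step that this Euclidean sketch deliberately leaves out.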
Findings
Algorithms perform well in simulations
Applied to gene trees and genome data
Reveals structure within complex phylogenetic data
Abstract
Most biological data are multidimensional, posing a major challenge to human comprehension and computational analysis. Principal component analysis is the most popular approach to rendering two- or three-dimensional representations of the major trends in such multidimensional data. The problem of multidimensionality is acute in the rapidly growing area of phylogenomics. Evolutionary relationships are represented by phylogenetic trees, and very typically a phylogenomic analysis results in a collection of such trees, one for each gene in the analysis. Principal component analysis offers a means of quantifying variation and summarizing a collection of phylogenies by dimensional reduction. However, the space of all possible phylogenies on a fixed set of species does not form a Euclidean vector space, so principal component analysis must be reformulated in the geometry of tree-space, which…
Taxonomy
Topics: Genomics and Phylogenetic Studies · Genetic diversity and population structure · Evolution and Paleontology Studies
