Multi Loci Phylogenetic Analysis with Gene Tree Clustering
Ruriko Yoshida, Kenji Fukumizu, Chrysafis Vogiatzis

TL;DR
This paper shows that the normalized cut (Ncut) clustering framework efficiently and accurately clusters gene trees by topological similarity, measured as geodesic distance in Billera-Holmes-Vogtmann (BHV) tree space. In the reported comparison, Ncut outperforms hierarchical clustering and is comparable to k-means.
Contribution
It applies the normalized cut framework to gene tree clustering, using geodesic distances between trees in BHV tree space, and compares it against hierarchical clustering and k-means.
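To make the idea concrete, here is a minimal sketch of a two-way normalized cut on a precomputed distance matrix, using the standard spectral relaxation. It is not the authors' implementation. The function name `ncut_bipartition`, the Gaussian affinity, the median-distance bandwidth, and the toy matrix are all assumptions made for illustration; in practice the distances would be BHV geodesic distances between gene trees, computed separately.

```python
import numpy as np

def ncut_bipartition(dist, sigma=None):
    """Split items into two clusters from a symmetric distance matrix
    via the spectral relaxation of the normalized cut."""
    dist = np.asarray(dist, dtype=float)
    if sigma is None:
        # Assumed bandwidth heuristic: median of the off-diagonal distances.
        sigma = np.median(dist[np.triu_indices_from(dist, k=1)])
    # Gaussian affinity (an assumption): near items get weights close to 1.
    w = np.exp(-(dist ** 2) / (2.0 * sigma ** 2))
    np.fill_diagonal(w, 0.0)
    deg = w.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    lap = np.eye(len(dist)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(lap)  # eigenvalues come back in ascending order
    # Second-smallest eigenvector, rescaled to solve (D - W) y = lambda * D y.
    y = d_inv_sqrt * eigvecs[:, 1]
    # Threshold at zero to get the two clusters.
    return (y > 0).astype(int)

# Toy stand-in for a matrix of pairwise tree distances (made-up values):
# items 0-2 are close to each other, items 3-5 are close to each other.
toy = np.full((6, 6), 10.0)
toy[:3, :3] = 1.0
toy[3:, 3:] = 1.0
np.fill_diagonal(toy, 0.0)

labels = ncut_bipartition(toy)
print(labels)
```

The sign of an eigenvector is arbitrary, so which group receives label 0 or 1 can differ between runs or platforms; only the grouping itself is meaningful. More than two clusters would require recursive splitting or several eigenvectors, which this sketch does not cover.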
Findings
Ncut accurately clusters gene trees in simulated data.
Ncut outperforms hierarchical clustering and is comparable to k-means.
The NJp method shows better performance than maximum likelihood estimation (MLE).
Abstract
Summary: Both theory and empirical evidence indicate that phylogenies (trees) of different genes (loci) do not display precisely matched topologies. This phylogenetic incongruence is attributed to the reticulated evolutionary history of most species due to meiotic sexual recombination in eukaryotes, or horizontal transfers of genetic materials in prokaryotes. Nonetheless, most genes do display topologically related phylogenies; this implies they form cohesive subsets (clusters). In this work, we compare popular clustering methods, and show how the performance of the normalized cut framework is efficient and statistically accurate when obtaining clusters on the set of gene trees based on the geodesic distance between them over the Billera-Holmes-Vogtmann (BHV) tree space. We proceed to present a computational study on the performance of different clustering methods with and without…
Taxonomy
Topics: Genomics and Phylogenetic Studies · Genetic diversity and population structure · Fractal and DNA sequence analysis
