Computational Tools for Evaluating Phylogenetic and Hierarchical Clustering Trees
John Chakerian, Susan Holmes

TL;DR
This paper implements the geometric distance between trees developed by Billera, Holmes and Vogtmann (BHV), which applies equally to phylogenetic and hierarchical clustering trees, and uses it for statistical inference and for evaluating tree stability and influence in biological and data mining contexts.
Contribution
It implements the BHV geometric distance for trees and demonstrates its use in statistical inference and tree stability analysis in evolutionary biology, bioinformatics, psychometrics and data mining.
Findings
The BHV distance effectively compares phylogenetic and clustering trees.
Multidimensional scaling approximates treespace for visualization.
The method assesses influence of variables and observations in hierarchical clustering.
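To make the multidimensional scaling finding concrete, here is a minimal sketch of classical (Torgerson) MDS applied to a pairwise distance matrix. It is not the paper's implementation: the function name `classical_mds` and the toy matrix `dist` are hypothetical, and the distances come from four points on a 3-by-4 rectangle rather than from BHV computations between trees. Because treespace is not Euclidean, an embedding of real BHV distances would only approximate them, whereas this toy input is reproduced exactly.

```python
import math


def _matvec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]


def classical_mds(dist, dims=2, iters=500):
    """Embed n objects in `dims` dimensions from a symmetric distance matrix.

    Hypothetical helper for illustration: double-centers the squared
    distances and extracts leading eigenpairs by power iteration.
    """
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J, with J the centering matrix.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]

    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        # Deterministic, non-constant starting vector.
        v = [1.0 + 0.1 * ((i * 7 + d * 3) % 5) for i in range(n)]
        for _ in range(iters):
            w = _matvec(b, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        vnorm = math.sqrt(sum(x * x for x in v))
        v = [x / vnorm for x in v]
        # Rayleigh quotient gives the eigenvalue for the converged vector.
        lam = sum(x * y for x, y in zip(v, _matvec(b, v)))
        if lam <= 1e-12:
            break  # no further positive eigenvalue; remaining axes stay zero
        scale = math.sqrt(lam)
        for i in range(n):
            coords[i][d] = scale * v[i]
        # Deflate so the next pass finds the next eigenpair.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords


# Toy stand-in for a tree-to-tree distance matrix (assumption: these are
# Euclidean distances between the corners of a 3-by-4 rectangle).
dist = [
    [0.0, 3.0, 5.0, 4.0],
    [3.0, 0.0, 4.0, 5.0],
    [5.0, 4.0, 0.0, 3.0],
    [4.0, 5.0, 3.0, 0.0],
]
coords = classical_mds(dist)
embedded = [[math.dist(p, q) for q in coords] for p in coords]
```

With a matrix of distances between trees in place of `dist`, the rows of `coords` would give planar positions for plotting, and comparing `embedded` with the input indicates how much the flat picture distorts the original distances.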
Abstract
Inferential summaries of tree estimates are useful in the setting of evolutionary biology, where phylogenetic trees have been built from DNA data since the 1960s. In bioinformatics, psychometrics and data mining, hierarchical clustering techniques output the same mathematical objects, and practitioners have similar questions about the stability and "generalizability" of these summaries. This paper provides an implementation of the geometric distance between trees developed by Billera, Holmes and Vogtmann (2001) [BHV], equally applicable to phylogenetic trees and hierarchical clustering trees, and shows some of the applications in statistical inference for which this distance can be useful. In particular, since BHV have shown that the space of trees is negatively curved (a CAT(0) space), a natural representation of a collection of trees is a tree. We compare this representation to the…
Taxonomy
Topics: Morphological variations and asymmetry · Evolution and Paleontology Studies · Gene expression and cancer classification
