Fitting trees to $\ell_1$-hyperbolic distances
Joon-Hyeok Yim, Anna C. Gilbert

TL;DR
This paper introduces a new approach to fit trees to hyperbolic distances by analyzing hyperbolicity vectors and their norms, providing bounds on embedding errors and revealing differences in real-world versus synthetic hierarchical data.
Contribution
It develops an algorithm that bounds tree embedding error using the $\ell_1$ norm of hyperbolicity, offering a novel theoretical framework and empirical insights into hierarchical data.
Findings
The $\ell_1$ error is tightly bounded by the hyperbolicity vector's $\ell_1$ norm.
Standard datasets for hierarchical analysis differ significantly from synthetic tree-like data.
The proposed algorithm outperforms Gromov's classical results in both theory and practice.
Abstract
Building trees to represent or to fit distances is a critical component of phylogenetic analysis, metric embeddings, approximation algorithms, geometric graph neural nets, and the analysis of hierarchical data. Much of the previous algorithmic work, however, has focused on generic metric spaces (i.e., those with no a priori constraints). Leveraging several ideas from the mathematical analysis of hyperbolic geometry and geometric group theory, we study the tree fitting problem as finding the relation between the hyperbolicity (ultrametricity) vector and the error of tree (ultrametric) embedding. That is, we define a vector of hyperbolicity (ultrametric) values over all triples of points and compare the norms of this vector with the norm of the distortion of the best tree fit to the distances. This formulation allows us to define the average hyperbolicity…
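To make the idea of a hyperbolicity vector concrete, here is a minimal Python sketch. It is not the authors' code. It assumes the standard Gromov four-point condition relative to a base point `w`: for each triple, the entry is the gap between the two smallest of the three Gromov products, which is zero exactly when the triple is tree-like with respect to `w`. The function names, the choice of base point, and the example matrices are all illustrative assumptions.

```python
from itertools import combinations

import numpy as np


def gromov_product(d, w, x, y):
    """Gromov product (x|y)_w = (d(w,x) + d(w,y) - d(x,y)) / 2."""
    return 0.5 * (d[w, x] + d[w, y] - d[x, y])


def hyperbolicity_vector(d, w=0):
    """Four-point violations over all triples, relative to base point w.

    Assumption: the entry for a triple {x, y, z} is the largest delta
    needed for (a|b)_w >= min((a|c)_w, (b|c)_w) - delta to hold under
    every relabelling, i.e. the middle Gromov product minus the smallest.
    """
    points = [i for i in range(d.shape[0]) if i != w]
    values = []
    for x, y, z in combinations(points, 3):
        products = sorted(
            [
                gromov_product(d, w, x, y),
                gromov_product(d, w, x, z),
                gromov_product(d, w, y, z),
            ]
        )
        values.append(products[1] - products[0])
    return np.array(values)


# A path on 5 points is a tree metric: every entry should vanish.
idx = np.arange(5)
path = np.abs(idx[:, None] - idx[None, :]).astype(float)

# A 4-cycle with unit edges is not a tree metric.
cycle = np.array(
    [
        [0.0, 1.0, 2.0, 1.0],
        [1.0, 0.0, 1.0, 2.0],
        [2.0, 1.0, 0.0, 1.0],
        [1.0, 2.0, 1.0, 0.0],
    ]
)

path_vec = hyperbolicity_vector(path)
cycle_vec = hyperbolicity_vector(cycle)

# l1 norm and average of the vector, the quantities the abstract compares
# against the distortion of a tree fit.
cycle_l1 = np.abs(cycle_vec).sum()
cycle_avg = cycle_vec.mean()
```

Enumerating every triple costs cubic time in the number of points, so a sketch like this is only practical for small distance matrices. For larger inputs one would likely sample triples.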
Taxonomy
Topics: Mathematical Dynamics and Fractals · Geometric and Algebraic Topology · Topological and Geometric Data Analysis
