Gromov-Hausdorff stability of linkage-based hierarchical clustering methods
A. Martínez-Pérez

TL;DR
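This paper investigates the stability of linkage-based hierarchical clustering methods, with perturbations measured in the Gromov-Hausdorff metric. It shows that, under some basic conditions, standard methods are semi-stable (stable when the input is close enough to an ultrametric space), and that adding an unchaining condition produces unstable methods, apart from exotic examples.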
Contribution
It provides a theoretical analysis of the stability properties of linkage-based hierarchical clustering, highlighting conditions for semi-stability and the effects of unchaining conditions.
Findings
Under some basic conditions, standard linkage-based methods are semi-stable: they are stable when the input data is close enough to an ultrametric space.
Introducing unchaining conditions generally causes instability.
The only exceptions to this instability result are exotic examples.
Abstract
A hierarchical clustering method is stable if small perturbations on the data set produce small perturbations in the result. These perturbations are measured using the Gromov-Hausdorff metric. We study the problem of stability on linkage-based hierarchical clustering methods. We obtain that, under some basic conditions, standard linkage-based methods are semi-stable. This means that they are stable if the input data is close enough to an ultrametric space. We prove that, apart from exotic examples, introducing any unchaining condition in the algorithm always produces unstable methods.
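The notions in the abstract can be illustrated with a minimal sketch, which is not the paper's construction. It assumes single linkage as the example method (the abstract does not single out any one linkage), represents its output as the minimax-path ultrametric, and uses the sup-norm difference between two distance matrices on the same labelled points as a simple proxy for the perturbation size. That quantity bounds twice the Gromov-Hausdorff distance from above but is not the Gromov-Hausdorff distance itself. All function names are hypothetical.

```python
from itertools import product


def single_linkage_ultrametric(d):
    """Ultrametric induced by single linkage on distance matrix d.

    u[i][j] is the smallest possible "largest hop" over all paths
    from i to j (a Floyd-Warshall variant for bottleneck paths).
    """
    n = len(d)
    u = [row[:] for row in d]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                u[i][j] = min(u[i][j], max(u[i][k], u[k][j]))
    return u


def is_ultrametric(d, tol=1e-12):
    """Check the strong triangle inequality d(i,j) <= max(d(i,k), d(k,j))."""
    n = len(d)
    return all(
        d[i][j] <= max(d[i][k], d[k][j]) + tol
        for i, j, k in product(range(n), repeat=3)
    )


def sup_distance(a, b):
    """Largest entrywise difference between two matrices of equal size."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Illustrative data (assumed for this sketch): an ultrametric space on
# three points and a small perturbation of it.
ultra = [[0.0, 1.0, 3.0],
         [1.0, 0.0, 3.0],
         [3.0, 3.0, 0.0]]
perturbed = [[0.0, 1.1, 2.9],
             [1.1, 0.0, 3.05],
             [2.9, 3.05, 0.0]]

out_ultra = single_linkage_ultrametric(ultra)
out_perturbed = single_linkage_ultrametric(perturbed)

input_gap = sup_distance(ultra, perturbed)
output_gap = sup_distance(out_ultra, out_perturbed)
```

In this sketch an ultrametric input is returned unchanged, and the gap between the two outputs does not exceed the gap between the two inputs. That is the kind of behaviour the word "stable" refers to. The example says nothing about other linkages or about methods with an unchaining condition, which are the subject of the paper's results.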
Taxonomy
Topics: Topological and Geometric Data Analysis · Data Management and Algorithms · Advanced Mathematical Theories
