$k$-Means Clustering for Persistent Homology
Yueqi Cao, Prudence Leung, Anthea Monod

TL;DR
This paper proves convergence of the $k$-means clustering algorithm on the space of persistence diagrams used in topological data analysis, a metric space with highly complex geometry, and demonstrates the algorithm's effectiveness through numerical experiments.
Contribution
It establishes convergence of $k$-means clustering directly on persistence diagram space, and characterizes the solution to the underlying optimization problem in the Karush–Kuhn–Tucker framework, a novel approach in topological data analysis.
Findings
$k$-means converges on persistence diagram space.
Clustering on diagrams outperforms vectorized representations.
Numerical experiments validate theoretical results.
Abstract
Persistent homology is a methodology central to topological data analysis that extracts and summarizes the topological features within a dataset as a persistence diagram; it has recently gained much popularity from its myriad successful applications to many domains. However, its algebraic construction induces a metric space of persistence diagrams with a highly complex geometry. In this paper, we prove convergence of the $k$-means clustering algorithm on persistence diagram space and establish theoretical properties of the solution to the optimization problem in the Karush–Kuhn–Tucker framework. Additionally, we perform numerical experiments on various representations of persistent homology, including embeddings of persistence diagrams as well as diagrams themselves and their generalizations as persistence measures; we find that $k$-means clustering performance directly on persistence…
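To make "clustering directly on persistence diagram space" concrete, here is a minimal sketch of the assignment step of $k$-means when the data points are diagrams rather than vectors. It rests on assumptions not stated in the abstract: diagrams are finite arrays of (birth, death) pairs, and the metric is the 2-Wasserstein distance with Euclidean ground metric, where unmatched points are sent to the diagonal. The names `wasserstein2` and `assign_clusters` are hypothetical, and the center-update step (a Fréchet mean of the diagrams in each cluster) is left out.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def wasserstein2(D1, D2):
    """2-Wasserstein distance between two finite persistence diagrams.

    Assumption: each diagram is an array-like of (birth, death) pairs and
    the ground metric is Euclidean. Points may be matched to the diagonal.
    """
    D1 = np.asarray(D1, dtype=float).reshape(-1, 2)
    D2 = np.asarray(D2, dtype=float).reshape(-1, 2)
    n, m = len(D1), len(D2)
    if n + m == 0:
        return 0.0

    # Augmented square cost matrix: each diagram is padded with diagonal
    # slots so that every point can be matched to the diagonal instead.
    C = np.zeros((n + m, n + m))
    # Point-to-point cost: squared Euclidean distance.
    C[:n, :m] = ((D1[:, None, :] - D2[None, :, :]) ** 2).sum(axis=2)
    # Point-to-diagonal cost: squared distance to the orthogonal projection
    # onto the diagonal, which equals (death - birth)^2 / 2.
    C[:n, m:] = ((D1[:, 1] - D1[:, 0]) ** 2 / 2.0)[:, None]
    C[n:, :m] = ((D2[:, 1] - D2[:, 0]) ** 2 / 2.0)[None, :]
    # Diagonal-to-diagonal cost stays 0.

    rows, cols = linear_sum_assignment(C)  # optimal matching
    return float(np.sqrt(C[rows, cols].sum()))


def assign_clusters(diagrams, centers):
    """k-means assignment step: index of the nearest center for each diagram."""
    dists = np.array([[wasserstein2(D, c) for c in centers] for D in diagrams])
    return dists.argmin(axis=1)


# Toy usage with made-up diagrams and centers.
diagrams = [[[0.0, 1.0]], [[0.0, 5.0]]]
centers = [[[0.0, 1.1]], [[0.0, 4.8]]]
labels = assign_clusters(diagrams, centers)
```

A full $k$-means iteration would alternate this assignment step with recomputing each center from the diagrams assigned to it; that update is the part that depends on the geometry of persistence diagram space and is not attempted here.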
Taxonomy
Topics: Topological and Geometric Data Analysis
